ABOUT THIS MANUAL 


This manual is a detailed applications guide to using the IDT R3721 and 
IDT73720 to construct appropriate DRAM subsystems for an R3051 family CPU. 
It describes a wide variety of memory subsystems, and has been written 
assuming that the system designer will primarily study the types of 
subsystems appropriate to the application at hand; it is not assumed that 
each system designer will read the manual in its entirety. 


In addition to the design information, the manual contains overview 
chapters on the DRAM controller (IDT R3721), Bus Exchanger (IDT73720) and 
R3051 family bus interface. Also included is a brief review of DRAM 
fundamentals. 


A quantitative description of the R3721 electrical interface is provided in the 
data sheet for this product. Also included in the data sheet is the mechanical 
description of the part, including packaging and pin-out. 


Additional information on development tools, additional support chips, the 
R3051 family, and the use of these products in various applications is 
provided in separate data sheets and applications notes. 


Any of this information is readily available from your local IDT sales 
representative. 


CONTENTS OVERVIEW 


Chapter 1 contains a brief overview of the capabilities of the R3721 DRAM 
controller. 


Chapter 2 contains a description of the R3051 family bus interface. 
Chapter 3 contains a brief overview of the fundamentals of DRAM operation. 
Chapter 4 describes how the R3721 DRAM controller operates. 


Chapter 5 describes how to program the R3721 to enable the various features 
and timing models it supports. 


Chapter 6 describes the various interfaces of the R3721, and describes how 
to connect it to the CPU, the DRAMs, the data path, and also how to use it with 
other memory subsystems. 


Chapter 7 describes the considerations involved in the construction of a non- 
interleaved DRAM subsystem. Various DRAM configurations are described. 


Chapter 8 provides a detailed analysis of a particular non-interleaved memory 
configuration. This chapter contains information on how to perform the timing 
analysis required to properly program the R3721 in such a system. 


Chapter 9 describes the considerations involved in the construction of an 
interleaved DRAM subsystem. 


Chapter 10 contains a detailed description of a particular interleaved memory 
configuration. This chapter also contains a detailed analysis of how to properly 
program the R3721 for an interleaved memory system. 


Chapter 11 describes the reset sequence, the refresh timing, and the clocking 
of the R3721. 


Appendix A describes the IDT73720 Bus Exchanger. 


Integrated Device Technology, Inc. reserves the right to make changes to its products or specifications at any time, without notice, in order to 
improve design or performance and to supply the best possible product. IDT does not assume any responsibility for use of any circuitry described 
other than the circuitry embodied in an IDT product. The Company makes no representations that circuitry described herein is free from patent 
infringement or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent, 
patent rights or other rights, of Integrated Device Technology, Inc. 


LIFE SUPPORT POLICY 

Integrated Device Technology's products are not authorized for use as critical components in life support devices or 
systems unless a specific written agreement pertaining to such intended use is executed between the manufacturer 
and an officer of IDT. 

1. Life support devices or systems are devices or systems which (a) are intended for surgical implant into the body 
or (b) support or sustain life and whose failure to perform, when properly used in accordance with instructions for 
use provided in the labeling, can be reasonably expected to result in a significant injury to the user. 

2. A critical component is any component of a life support device or system whose failure to perform can be 
reasonably expected to cause the failure of the life support device or system, or to affect its safety or effectiveness. 


The IDT logo is a registered trademark and RISController, R3051, and RISChipset are trademarks of Integrated Device Technology, Inc. 
MIPS is a registered trademark of MIPS Computer Systems, Inc. 

UNIX is a registered trademark of AT&T. 

All others are trademarks of their respective companies. 
INTRODUCTION 

The R3721 is a member of the R3051™ system support RISChipset™. The 
R3721 is a Dynamic RAM Memory Controller, designed to offer the same levels 
of system flexibility as the R3051 family. 

The R3721 is responsible for translating between the R3051 family bus 
interface and the special control requirements of various DRAM based sub- 
systems. The R3721 performs all necessary handshaking and timing control. 
All that is required to implement a DRAM sub-system for the R3051 family is 
the R3721, DRAMs, an address decoder, and some transceivers for the data 
path. 

The R3721 has been designed so that systems can be implemented with a 
field-upgradable memory system. In order to upgrade to larger 
memory devices, or to increase the amount of memory, software merely needs 
to re-program the R3721 mode register at boot time. No complicated re-routing 
of address lines, nor modifications of the data path need to occur. Thus, as with 
the R3051 family, a single footprint and base design can offer a wide variety of 
end products, depending on the frequency of devices selected, the amount of 
memory installed, and the specific R3051 family CPU selected. 

The R3721 is packaged using a low-cost 84-pin PLCC package, and supports 
a wide variety of DRAM-based sub-systems: 

• 256k x 1 through 4Mb x 4 DRAM devices 
• 1 to 4 banks of DRAM 
• Non-interleaved or two-way interleaved 
• Direct control of DRAM data path transceivers 
• Direct handshake with R3051 
• Supports all bus transfers of the R3051 family 
• DRAM access times of 100 ns or faster 
• Supports page mode operation of DRAMs (either read or write) using 
  on-chip page detector 
• CAS-before-RAS refresh 
• Capability to drive up to 36 DRAMs directly 
• Directly controls x8, x9, x32, and x36 DRAM Memory Modules 
• Highly programmable DRAM timing control for optimum performance 
• Supports various memory address decoding schemes 


Figure 1.1 illustrates a block diagram of the R3721 DRAM controller. 
DESCRIPTION 


The R3721 DRAM controller contains all of the functional elements necessary 
to support the bus transaction requirements of the R3051 family. 

The R3721 connects directly to the R3051 bus, and captures address and 
control information from the bus as the R3051, or a DMA controller, drives it. 

The R3721 begins its transaction once its memory space is selected by an 
external address decoder. 

The R3721 will generate all of the DRAM control signal sequencing required: 

• Row address set-up to RAS asserted 
• Row address hold from RAS asserted 
• Column address set-up to CAS asserted 
• RAS to CAS delay 
• CAS to data valid (read) 
• WE to CAS set-up (write) 

In addition, the R3721 will manage the transceiver-based data path interface, 
to properly control the flow of data between the CPU bus and the DRAM devices. 
The R3721 can either control standard FCT245 type transceivers (non- 
interleaved memory systems), or use the high-performance 73720 Bus 
Exchanger (for interleaved or banked memory systems). Figure 1.2 illustrates 
a typical system composed of the R3051, R3721 DRAM controller, and 73720 
bus exchanger. 

Finally, the R3721 will provide the proper acknowledgement back to the 
R3051, at the optimum time. That is, the R3721 will generate ACK and/or 
RdCEn, according to the timing model for the DRAMs and the type of transfer 
requested. 
Figure 1.2 R3051-based System Using R3721 DRAM Controller 


The types of transactions performed include: 

• Single Datum Reads. The R3051 can request a single datum read (one 
  to four bytes). The DRAM controller is capable of processing that read as 
  a standard access or as a page mode access, depending on its locality to 
  the preceding access. 

• Single Datum Writes. The R3051 can perform single datum writes (one 
  to four bytes). The DRAM controller will process that write as either a 
  standard write access or a page mode write access, depending on its 
  locality to the preceding access. The R3721 can use the WrNear output 
  from the R3051 to provide a quick address decode, and can retire near 
  writes in two cycles in almost any speed system. 

• Quad Word Reads. The R3051 requests quad word reads in response to 
  cache misses. The DRAM controller uses page mode in the DRAM to 
  provide the data. If the memory sub-system is fast enough, the data will 
  be returned in a true “burst” response (one word per clock cycle after 
  initial latency). Otherwise, the R3721 will control RdCEn to “throttle” the 
  read response. 

• CAS-before-RAS Refresh. The R3721 contains all the logic needed to 
  perform DRAM refresh automatically, using simple CAS-before-RAS 
  refresh signalling. 


CONFIGURABILITY 

The R3721 supports various memory configurations and speeds, as 
programmed into its on-chip, write-only mode register (Figure 1.3). The mode 
register allows the system designer to program the following system 
characteristics: 

• Speed of address decoding 
• Refresh rate 
• CAS pre-charge time in page mode accesses 
• CAS low time, which indicates data access time from assertion of CAS 
• RAS waveform, including low and high times 
• RAS to CAS delay required 
• Memory configuration (interleaved, etc.) 
• DRAM size, from 256k x 1 through 4Mb x 4 

At boot time, the processor programs the DRAM controller according to the 
type of memory system connected. Note that the DRAM controller may be re- 
programmed; thus, it is possible to program in a “maximum case” value, 
perform memory diagnostics to determine the exact configuration, and then re- 
program the device according to the actual system configuration. This 
capability is included to allow the system to re-configure itself at boot time, to 
further support various field and manufacturing options for the given system. 

The DRAM controller performs all internal address shifting necessary to 
accommodate the various depths of DRAM. That is, all R3721s are connected 
to the address bus in the same fashion, regardless of the DRAM organization; 
logic internal to the R3721 multiplexes R3051 address lines to the appropriate 
DRAM row or column address lines, according to the type of device under the 
R3721’s control. Again, this capability has been provided to allow field 
upgrades to higher density DRAM devices. 
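
The internal address multiplexing described above can be illustrated with a small model. This is only a sketch: the pin counts are the standard ones for these DRAM depths (a 256k-deep device takes 9 multiplexed address bits, a 1M-deep device 10, a 4M-deep device 11), but the choice of which bits form the row and which form the column is an assumption made here for illustration and is not taken from the R3721 documentation. The names `ADDRESS_PINS` and `split_row_column` are hypothetical.

```python
from typing import Tuple

# Multiplexed address pins per DRAM depth (row and column each use this many bits).
ADDRESS_PINS = {"256k": 9, "1M": 10, "4M": 11}


def split_row_column(cell_index: int, depth: str) -> Tuple[int, int]:
    """Split a cell index within one DRAM device into (row, column).

    Assumption (not from the R3721 documentation): the low-order bits form
    the column address and the high-order bits form the row address, so
    that consecutive locations fall in the same row (page).
    """
    pins = ADDRESS_PINS[depth]
    if not 0 <= cell_index < (1 << (2 * pins)):
        raise ValueError("cell index out of range for this DRAM depth")
    column = cell_index & ((1 << pins) - 1)  # low bits select the column
    row = cell_index >> pins                 # remaining high bits select the row
    return row, column
```

Under this assumed mapping, moving from a 256k-deep to a 1M-deep device only changes the width of each field; the connection to the CPU address bus stays the same, which is the property the text relies on for field upgrades.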


PERFORMANCE CONSIDERATIONS 

The R3721 has been optimized to obtain maximum performance from low- 
cost, commodity DRAMs such as 1Mb 70-80ns 256k x 4 devices. Whereas 
discrete or PAL based DRAM control systems are typically limited to use only 
one edge of the processor SysClk output, the R3721 uses both clock edges of 
SysClk to provide higher granularity of timing in the DRAM sub-system and to 
achieve levels of performance that would be difficult and expensive to achieve 
in a discrete implementation. 
For example, at 25MHz in a non-interleaved memory configuration, the 
R3721 can perform a standard read access in 5 cycles. Similarly, page mode 
writes are retired at the maximum processor rate of one write every two cycles. 

Quad word reads obtain data at the rate of one word every two clock cycles. 
In a higher-performance memory configuration, interleaved memory can be 
used to increase the block refill rate to one word every clock cycle. This 
performance, coupled with the high cache hit rates inherent in the R3051 
family, allows system designers to build high-performance, low cost systems 
with a minimum of parts and design complexity. 
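
As a rough worked example of the figures quoted above, the following sketch computes the total time for a four-word block refill at 25MHz. It assumes, for illustration only, that the first word of a quad word read arrives with the same 5-cycle latency quoted for a standard read, and that the interleaved configuration has the same initial latency; neither assumption is stated in the text. The function name `quad_read_time` is hypothetical.

```python
from typing import Tuple


def quad_read_time(clock_mhz: float, first_word_cycles: int,
                   cycles_per_additional_word: int,
                   words: int = 4) -> Tuple[int, float]:
    """Return (total clock cycles, total time in ns) for a block read."""
    cycles = first_word_cycles + (words - 1) * cycles_per_additional_word
    return cycles, cycles * 1000.0 / clock_mhz


# Non-interleaved: one word every two clocks after the first word.
non_interleaved = quad_read_time(25, 5, 2)
# Interleaved: one word every clock after the first word.
interleaved = quad_read_time(25, 5, 1)
```

With these assumed latencies the non-interleaved refill takes 11 cycles (440 ns) and the interleaved refill 8 cycles (320 ns), which shows how the repeat rate, rather than the initial latency, accounts for the difference between the two configurations.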

Thus, the combination of the R3051 family CPU and R3721 DRAM controller 
offers the system designer the ability to make cost/performance tradeoffs 
without absolutely crippling the performance of the end product. 


APPLICATIONS 

The R3721 is a basic building block, which fits a broad range of applications, 
including graphics systems, laser printers, data communications, and other 
applications requiring a high-performance processor. The R3721 is designed 
to eliminate all glue logic and PALs from the DRAM control sub-system, easing 
design, reducing time-to-market, increasing performance, and lowering system 
cost. 
The IDT R3051 family utilizes a simple, flexible bus interface to its external 
memory and I/O resources. The interface uses a single, multiplexed 32-bit 
address and data bus and a simple set of control signals to manage read and 
write operations. Complementing the basic read and write interface is a DMA 
Arbiter interface which allows an external agent to gain control of the memory 
interface to transfer data. This chapter provides an overview of the R3051 
memory interface; additional detail is found in the "R3051 Hardware User's 
Guide". 

The R3051 family supports the following types of operations on its interface: 


Write Operations: The R3051 family utilizes an on-chip write buffer 
to isolate the execution core from the speed of external memory during 
write operations. The write interface of the R3051 family is thus designed 
to allow a variety of write strategies, from fast 2-cycle write operations 
through multiple wait-state writes. 

The R3051 family supports the use of fast page mode writes by 
providing an output indicator, WrNear, to indicate that the current write 
may be retired using a page mode access. This facilitates the rapid 
“flushing” of the on-chip write buffer, since the majority of processor 
writes will occur within a localized area of memory. 

Read Operations: The processor executes read operations as the 
result of either a cache miss or an uncacheable reference. As with the 
write interface, the read interface has been designed to accommodate a 
wide variety of memory system strategies. There are two types of reads 
performed by the processor: 

Quad word reads occur when the processor requests a contiguous 
block of four words from memory. Bursts occur in response to instruction 
cache misses, and may occur in response to a data cache miss. The 
processor incorporates an on-chip 4-deep read buffer which may be used 
to “queue up” the read response before passing it through to the high- 
bandwidth cache and execution core. Read buffering is appropriate in 
systems which require wait states between adjacent words of a block read. 
On the other hand, systems which use high-bandwidth memory techniques 
(such as memory interleaving) can effectively bypass the read buffer by 
providing words of the block at the processor clock rate. Note that the 
choice of burst vs. read buffering is independent of the initial latency of 
the memory; that is, burst mode can be used even if multiple wait states 
are required to access the first word of the block. 

Single word reads are used for uncacheable references (such as I/O or 
boot code) and may be used in response to a data cache miss. The 
processor is capable of retiring a single word read in as few as two clock 
cycles. 

DMA Operations: The R3051 family includes a DMA arbiter which 
allows an external agent to gain full control of the processor read and write 
interface. DMA is useful in systems which need to move significant 
amounts of data within memory (e.g. BitBLT operations) or move data 
between memory and I/O channels. 
R3051 BUS INTERFACE PIN DESCRIPTION 

This section describes the signals used in the above interfaces. Note that 
many of the signals have multiple definitions which are de-multiplexed either 
by the ALE signal or the Rd and Wr control signals. Note that signals indicated 
with an overbar are active low. 


Address and Data Path 


A/D(31:0) I/O 

Address/Data: A 32-bit, time multiplexed bus which indicates the desired 
address for a bus transaction in one cycle, and which is used to transmit data 
between this device and external memory resources on other cycles. 

Bus transactions on this bus are logically separated into two phases: during 
the first phase, information about the transfer is presented to the memory 
system to be captured using the ALE output. This information consists of: 

Address(31:4): The high-order address for the transfer is presented. 

BE(3:0): These strobes indicate which bytes of the 32-bit bus 
will be involved in the transfer. BE(3) indicates that 
AD(31:24) is used; BE(2) indicates that AD(23:16) is 
used; BE(1) indicates that AD(15:8) is used; and BE(0) 
indicates that AD(7:0) is used. 
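
The byte-enable-to-lane assignment listed above follows a simple pattern: BE(n) covers A/D bits 8n+7 down to 8n. A minimal sketch of that mapping follows. The mask used here is a logical "asserted" mask (bit n set means BE(n) is asserted); the electrical polarity of the pins is deliberately not modeled, and the function name `active_lanes` is hypothetical.

```python
from typing import List, Tuple


def active_lanes(be_asserted: int) -> List[Tuple[int, int]]:
    """Return (high_bit, low_bit) of each A/D byte lane used in a transfer.

    be_asserted is a 4-bit logical mask: bit n set means BE(n) is asserted.
    Lanes are returned from the most significant byte to the least.
    """
    if not 0 <= be_asserted <= 0xF:
        raise ValueError("BE(3:0) is a 4-bit field")
    return [(8 * n + 7, 8 * n) for n in range(3, -1, -1)
            if be_asserted & (1 << n)]
```

For example, a full word transfer asserts all four strobes and uses AD(31:0), while a transfer with only BE(0) asserted uses AD(7:0).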








During write cycles, the bus contains the data to be stored and is driven from 
the internal write buffer. On read cycles, the bus receives the data from the 
external resource, in either a single word transaction or in a burst of four words, 
and places it into the on-chip read buffer. 


Addr(3:2) O 

Low Address (3:2) A 2-bit bus which indicates which word is currently 
expected by the processor. Specifically, this two bit bus presents either the 
address bits for the single word to be transferred (writes or single word reads) 
or functions as a two bit counter starting at ‘00’ for burst read operations. 
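
The two behaviors of Addr(3:2) can be sketched as follows. The text states only that the bus acts as a two-bit counter starting at '00' for burst reads; that it advances by one for each word of the block is an assumption made here. The function name `addr_3_2_sequence` is hypothetical.

```python
from typing import List


def addr_3_2_sequence(byte_address: int, burst: bool) -> List[int]:
    """Values presented on Addr(3:2) over the course of one transaction.

    For writes and single word reads, the two bits are address bits 3:2 of
    the word being transferred. For a burst read, the bits count up from 00
    (assumed here to step once per word of the four-word block).
    """
    if burst:
        return [0b00, 0b01, 0b10, 0b11]
    return [(byte_address >> 2) & 0b11]
```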


Read and Write Control Signals 


ALE O 

Address Latch Enable: Used to indicate that the A/D bus contains valid 
address information for the bus transaction. This signal is used by external 
logic (transparent latches) to capture the address for the transfer. 


DataEn O 

Data Input Enable: This signal indicates that the AD bus is no longer being 
driven by the processor during read cycles, and thus the external memory 
system may enable the drivers of the memory system onto this bus without 
having a bus conflict occur. During write cycles, or when no bus transaction 
is occurring, then this signal is negated. 
Burst/WrNear O 

Burst Transfer: On read transactions, this signal indicates that the current 
bus read is requesting a block of four contiguous words from memory (a burst 
read). This signal is asserted only in read cycles due to cache misses; it is 
asserted for all I-Cache miss read cycles, and for D-Cache miss read cycles if 
selected at device reset time. 

Write Near: On write transactions, this output tells the external memory 
system that the bus interface unit is performing back-to-back write transactions 
to an address within the same 256 entry memory “page” as the prior write 
transaction. This signal is useful in memory systems which employ page mode 
or static column DRAMs. 
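
The "near write" test can be pictured as a comparison of page numbers between consecutive writes. The sketch below assumes that each of the 256 page entries is one 32-bit word, giving a 1 KB page and a comparison of address bits 31:10; the text does not state the entry size, so treat this as an assumption. The model also ignores any reads that might occur between two writes. The names `PAGE_SHIFT` and `WrNearModel` are hypothetical.

```python
from typing import Optional

# Assumption: 256 entries x 4 bytes per entry = 1 KB page, so the page
# number is the byte address with the low 10 bits removed.
PAGE_SHIFT = 10


class WrNearModel:
    """Tracks the previous write and reports whether the next one is 'near'."""

    def __init__(self) -> None:
        self.prev_page: Optional[int] = None

    def write(self, byte_address: int) -> bool:
        """Record a write and return True if it hits the prior write's page."""
        page = byte_address >> PAGE_SHIFT
        near = self.prev_page is not None and page == self.prev_page
        self.prev_page = page
        return near
```

A memory controller can use such an indication to skip the row access and go directly to a page mode column access, since the row selected by the prior write is still valid.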


Rd O 

Read: An output which indicates that the current bus transaction is a read. 


Wr O 

Write: An output which indicates that the current bus transaction is a write. 


ACK I 

Acknowledge: An input which indicates to the device that the memory 
system has sufficiently processed the bus transaction, and that the processor 
may either advance to the next write buffer entry (writes) or release the 
execution core to process the read data (reads). 


RdCEn I 

Read Buffer Clock Enable: An input which indicates to the device that the 
memory system has placed valid data on the AD bus, and that the processor 
may move the data into the on-chip Read Buffer. 


BusError I 

Bus Error: Input to the bus interface unit to terminate a bus transaction 
due to an external bus error. This signal is only sampled during read and write 
operations. If the bus transaction is a read operation, then the CPU will also 
take a bus error exception. 





CHAPTER 2 R3051 FAMILY INTERFACE OVERVIEW 





Status Information 


Diag(1) O 

Diagnostic Pin 1. This output indicates whether the current bus read 
transaction is due to an on-chip cache miss, and also presents part of the miss 
address. The value output on this pin is time multiplexed: 

Cached: During the phase in which the A/D bus presents 
address information, this pin is an active high output 
which indicates whether the current read is a result of 
a cache miss. The value of this pin at this time in other 
than read cycles is undefined. 

Miss Address (3): During the remainder of the read operation, this output 
presents address bit (3) of the address the processor was 
attempting to reference when the cache miss occurred. 
Regardless of whether a cache miss is being processed, 
this pin reports the transfer address during this time. 


Diag(0) O 

Diagnostic Pin 0. This output distinguishes cache misses due to instruction 
references from those due to data references, and presents the remaining bit 
of the miss address. The value output on this pin is also time multiplexed: 

I/D: If the “Cached” Pin indicates a cache miss, then a high 
on this pin at this time indicates an instruction reference, 
and a low indicates a data reference. If the read is not 
due to a cache miss but rather an uncached reference 
(“Cached” is negated), then this pin is undefined during 
this phase. 

Miss Address (2): During the remainder of the read operation, this output 
presents address bit (2) of the address the processor was 
attempting to reference when the cache miss occurred. 
Regardless of whether a cache miss is being processed, 
this pin reports the transfer address during this time. 
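
The time multiplexing of the two diagnostic pins can be summarized with a small decoder. It takes the pin values sampled during the address phase and during the remainder of the read, and returns the information described above. The function and key names are hypothetical, and the I/D result is reported as `None` when the text says the pin is undefined.

```python
from typing import Dict, Optional, Union


def decode_diag(addr_phase_diag1: int, addr_phase_diag0: int,
                data_phase_diag1: int, data_phase_diag0: int
                ) -> Dict[str, Union[bool, int, Optional[str]]]:
    """Decode Diag(1:0) for a read transaction."""
    # Address phase: Diag(1) is "Cached" (high means the read is a cache miss).
    cache_miss = bool(addr_phase_diag1)
    # Address phase: Diag(0) is I/D, meaningful only for a cache miss.
    reference: Optional[str] = None
    if cache_miss:
        reference = "instruction" if addr_phase_diag0 else "data"
    # Remainder of the read: Diag(1) and Diag(0) carry address bits 3 and 2.
    miss_address_3_2 = ((data_phase_diag1 & 1) << 1) | (data_phase_diag0 & 1)
    return {"cache_miss": cache_miss,
            "reference": reference,
            "miss_address_3_2": miss_address_3_2}
```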


DMA Arbiter Interface 


These signals are involved when the processor exchanges bus mastership 
with an external agent. 


BusReq I 

DMA Arbiter Bus Request: An input to the device which requests that the 
processor tri-state its bus interface signals so that they may be driven by an 
external master. The negation of this input releases the bus back to the 
R3051/52. 


BusGnt O 

DMA Arbiter Bus Grant. An output from the R3051/52 used to acknowledge 
that a BusReq has been detected, and that the bus is relinquished to the 
external master. 
READ TRANSACTIONS 


The majority of the execution engine read requests are never seen at the 
memory interface, but rather are satisfied by the internal cache resources of 
the processor. Only in the cases of uncacheable references or cache misses do 
read transactions occur on the bus. 

In general, there are only two types of read transactions: quad word reads 
and single word reads. Note that partial word reads of less than 32-bits can 
be thought of as a simple subset of the single word read, with only some of the 
byte enable strobes asserted. 

Quad word reads occur only in response to cache misses. All instruction 
cache misses are processed as quad word reads; data cache misses may be 
processed as quad word reads or single word reads, depending on the mode 
selection made during reset initialization of the processor. 

In processing reads, there are two parameters of interest. The first 
parameter is the initial latency to the first word of the read. This latency is 
influenced by the overall system architecture as well as the type of memory 
system being addressed: the time required to perform address decoding and 
bus arbitration, memory pre-charge requirements, and memory control 
requirements, as well as memory access time. The initial latency is the 
only parameter of interest in single word reads. 

The second parameter of interest (only in quad word refills) is the repeat rate 
of data; that is, the time required for each subsequent word to be returned to 
the processor. Factors which influence the repeat rate include the memory 
system architecture, the types and speeds of devices used, and the 
sophistication of the memory controller: memory interleaving and the use of 
faster devices both serve to increase the repeat rate (minimize the amount of 
time between adjacent words). 

The R3051 family has been designed to accommodate a wide variety of 
memory system designs, including no wait state operations (first word available 
in two cycles) and true burst operation (adjacent words every clock cycle), 
through simpler, slower systems incorporating many bus wait states to the first 
word and multiple clock cycles between adjacent words (this is accomplished 
by use of the on-chip read buffer). The R3721 DRAM controller supports these 
various schemes, according to the memory configuration under its control. 


READ INTERFACE TIMING OVERVIEW 

The read interface is designed to allow a variety of memory strategies. An 
overview of how data is transmitted from memory and I/O devices to the 
processor is discussed below. Note that multiplexing the address and data bus 
does not slow down read transactions: the address is on the A/D bus for only 
one-half clock cycle, so the data drivers can be enabled quickly; memory and 
I/O devices initiate their transfers based on addressing and chip enable, not 
on the availability of the bus. Thus, memory does not need to “wait” for the bus, 
and no performance penalty occurs. 


Memory Addressing 

A read transaction begins when the processor asserts its Rd control output, 
and also drives the address and other control information onto the A/D and 
memory interface bus. Figure 2.1 illustrates the start of a processor read 
transaction, including the addressing of memory and the bus turn around. 

The addressing occurs in a half-cycle of the SysClk output. At the rising edge 
of SysClk, the processor will drive the read target address onto the A/D bus. 
At this time, ALE will also be asserted, to allow an external transparent latch 
to capture the address. Depending on the system design, address decoding 
could occur in parallel with address de-multiplexing (that is, the decoder could 
start on the assertion of ALE, and the output of the decoder captured by ALE), 
or could occur on the output side of the transparent latches. During this phase, 
DataEn will be held high indicating that memory drivers should not be enabled 
onto the A/D bus. 

Figure 2.1 Start of Processor Read 

Concurrent with driving addresses on the A/D bus, the processor will 
indicate whether the read transaction is a quad word read or single word read, 
by driving Burst to the appropriate polarity (low for a quad word read). If a quad 
word read is indicated, the Addr(3:2) lines will drive ‘00’ (the start of the block); 
if a single word (or subword) is indicated, the Addr(3:2) lines will indicate the 
word address for the transfer. The functioning of the counter during quad 
words is described later. 





Bus Turn Around 

Once the A/D bus has presented the address for the transfer, it is “turned 
around” by the processor to accept the incoming data. This occurs in the 
second phase of the first clock cycle of the read transaction, as illustrated in 
Figure 2.1. 
The processor turns the bus around by carefully performing the following 
sequence of events: 

• It negates ALE, causing the transparent address latches to capture the 
  contents of the A/D bus. 

• It disables its output drivers on the A/D bus, allowing it to be driven by 
  an external agent. The processor design guarantees that ALE is 
  negated prior to tri-stating the A/D bus. 

• It then asserts DataEn, to indicate that the bus may now be 
  driven by the external memory resource. The processor design ensures 
  that the A/D bus is released prior to DataEn being asserted. DataEn may 
  be directly connected to the output enable of external memory, and no bus 
  conflicts will occur. 


Thus, the processor A/D bus is ready to be driven by the end of the second 
phase of the read transaction. At this time, it begins to look for the end of the 
read cycle. 
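
The ordering guarantees in the bus turn around sequence can be expressed as a simple check, for instance when reviewing a simulation trace. This is a sketch only: the event times are hypothetical numbers supplied by the caller, not values from the data sheet, and the function name `turnaround_is_safe` is an illustration.

```python
def turnaround_is_safe(ale_negated_ns: float, ad_released_ns: float,
                       dataen_asserted_ns: float) -> bool:
    """Check the ordering the processor guarantees during bus turn around.

    ALE must be negated before the A/D bus is tri-stated (so the latches
    hold a valid address), and the A/D bus must be released before DataEn
    is asserted (so memory drivers never fight the processor).
    """
    return ale_negated_ns < ad_released_ns < dataen_asserted_ns
```

A trace in which DataEn is asserted before the processor releases the bus would fail this check, which corresponds to the bus conflict that the guaranteed ordering is designed to prevent.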


Bringing Data into the Processor 

Regardless of whether the transfer is a quad word read or a single word 
transfer, the basic mechanism for transferring data presented on the A/D bus 
into the processor is the same. 

Although there are two control signals involved in terminating read operations, 
only the RdCEn signal is used to cause data to be captured from the bus. 

The memory system asserts RdCEn to indicate to the processor that it has 
(or will have) data on the A/D bus to be sampled. The earliest that RdCEn can 
be detected by the processor is the rising edge of SysClk after it has turned the 
bus around (start of phase 1 of the second clock cycle of the read). 

If RdCEn is detected as asserted (with adequate setup and hold time to the 
rising edge of SysClk), the processor will capture (with proper setup and hold 
time) the contents of the A/D bus on the immediately subsequent falling edge 
of SysClk. This captures the data in the internal read buffer for later processing 
by the execution core/cache subsystem. 
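
The sampling rule described above (RdCEn checked at a rising edge of SysClk, data captured at the following falling edge) can be modeled with one entry per clock cycle. This is a behavioral sketch only; setup and hold times are not modeled, and the function name `capture_read_data` is hypothetical.

```python
from typing import Iterable, List, Tuple


def capture_read_data(cycles: Iterable[Tuple[bool, int]]) -> List[int]:
    """Model the read buffer contents after a sequence of clock cycles.

    Each cycle is (rdcen_at_rising_edge, ad_bus_at_falling_edge). The A/D
    value is captured only in cycles where RdCEn was seen asserted at the
    rising edge; in all other cycles the bus contents are ignored.
    """
    read_buffer: List[int] = []
    for rdcen_at_rising_edge, ad_bus_at_falling_edge in cycles:
        if rdcen_at_rising_edge:
            read_buffer.append(ad_bus_at_falling_edge)
    return read_buffer
```

For instance, two wait cycles followed by one cycle with RdCEn asserted leaves exactly one word in the read buffer, regardless of what the A/D bus carried during the wait cycles.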

Figure 2.2 illustrates the sampling of data by an R3051/52. 
Figure 2.2 Data Sampling by R3051 

Terminating the Read 

There are actually three methods for the external memory system to 
terminate an ongoing read operation: 


• It can supply an ACK (acknowledge) to the processor, to indicate that it 
  has sufficiently processed the read request and has or will supply the 
  requested data in a timely fashion. Note that ACK may be signalled to the 
  processor “early”, to enable it to begin processing the read data even while 
  additional data is brought from the A/D bus. This is applicable only in 
  burst read operations. 

• It can supply a BusError to the processor, to indicate that the requested 
  data transfer has “failed” on the bus, and force the processor to take a bus 
  error exception. Although the system interface behavior of the processor 
  when BusError is presented is identical to the behavior when ACK is 
  presented, no data will actually be written into the on-chip cache. Rather, 
  the cache line will either remain unchanged, or will be invalidated by the 
  processor, depending on how much of the read has already been processed. 

• It can supply the requested data, using RdCEn to enable the processor 
  to capture data from the bus. The processor will 
  “count” the number of times RdCEn is sampled as asserted; once the 
  processor counts that the memory system has returned the desired 
  amount of data (one word or four words), it will implicitly “acknowledge” 
  the read at the same time that it samples the last required RdCEn. This 
  approach leads to a simpler memory design at the cost of lower performance. 
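
The third method, in which the processor counts RdCEn assertions, can be sketched as follows. The model takes the value of RdCEn at each rising edge of SysClk and returns the index of the edge at which the read is implicitly acknowledged. It covers only this counting behavior; explicit ACK and BusError are not modeled, and the function name `implicit_ack_edge` is hypothetical.

```python
from typing import Optional, Sequence


def implicit_ack_edge(rdcen_samples: Sequence[bool],
                      quad_word: bool) -> Optional[int]:
    """Return the index of the rising edge that completes the read.

    rdcen_samples[i] is True if RdCEn is sampled asserted at rising edge i
    (edge 0 being the first edge at which RdCEn can be detected). The read
    completes when one word (single read) or four words (quad word read)
    have been counted. Returns None if the read never completes.
    """
    words_needed = 4 if quad_word else 1
    words_seen = 0
    for edge, asserted in enumerate(rdcen_samples):
        if asserted:
            words_seen += 1
            if words_seen == words_needed:
                return edge
    return None
```

A single word read with two wait cycles completes on the third sampled edge (index 2); a quad word read throttled to one word every two clocks after two wait cycles completes on index 8.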


The R3721 always uses a properly timed ACK to terminate quad word reads. 

There are actually two phases of terminating the read: there is the phase 
where the memory system indicates to the processor that it has sufficiently 
processed the read request, and the internal read buffer can be released to 
begin refilling the internal caches; and there is the phase in which the read 
control signals are negated by the processor bus interface unit. The difference 
between these phases is due to block refill: the R3721 “releases” the execution 
core even though additional words of the block are still required; in that case, 
the processor will continue to assert the external read control signals until all 
four words are brought into the read buffer, while simultaneously refilling/ 
executing based on the data already brought on board. 

Figure 2.3 shows the timing of the control signals when the read cycle is 
being terminated. 


READ TIMING DIAGRAMS 

This section illustrates a number of timing diagrams applicable to R3051 
family read transactions. These diagrams reference AC parameters whose 
values are contained in the R3051/52 data sheet. 


Single Word Reads 

Figure 2.4 illustrates the case of a single word read. In this figure, two bus 
wait cycles were required before the data was returned. Thus, two rising edges 
of SysClk occurred where neither RdCEn nor ACK was asserted. On the third
rising edge of SysClk, RdCEn was asserted. Optionally, ACK could also be 
asserted at this time, although it is not strictly necessary. 


Quad Word Reads 

Figure 2.5 (a, b) illustrates a block read in which bus wait cycles are required
before the first word is brought to the processor, but in which additional words
can be brought in at the processor clock rate. Thus, as with the no wait cycle
operation, ACK is returned simultaneously with the first RdCEn. Figure 2.5
(a) illustrates the start of the block read, including initial wait cycles to the first 
word; Figure 2.5 (b) illustrates the activity which occurs as data is brought onto 
the chip and the read is terminated. The use of memory interleaving in the 
DRAM subsystem allows true burst operation. 

Figure 2.6 (a, b) illustrates a block read in which bus wait cycles are required 
before the first word is returned, and in which wait cycles are required between 
subsequent words: Figure 2.6 (a) illustrates the first two words of the block 
being brought on chip; Figure 2.6 (b) illustrates the last two words of the read, 
including the optimum timing of ACK, and the negation of the read control 
signals. 

In this diagram, the R3721 returns ACK according to when the processor will
empty the read buffer. The R3721 determines the optimal cycle to assert ACK,
allowing the CPU to restart even while data is read from the DRAMs. The timing
of ACK ensures that the last data word is returned to the processor before it is
emptied from the read buffer.

Figure 2.4 Single Word Read Cycle
Figure 2.5 (a) Start of Burst Quad Word Read
Figure 2.5 (b) End of Burst Quad Word Read
Figure 2.6 (a) Start of Throttled Quad Word Read
Figure 2.6 (b) End of Throttled Quad Word Read

WRITE INTERFACE
The design goal of the write interface was to achieve two things: 

• Ensure that a relatively slow write cycle does not unduly degrade the
performance of the processor. To this end, a four deep write buffer has 
been incorporated on chip. The role of the write buffer is to decouple the 
speed of the memory interface from the speed of the execution engine. The 
write buffer captures store information (data, address, and transaction 
size) from the processor at its clock rate, and later presents it to the 
memory interface at the rate it can perform the writes. Four such buffer 
entries are incorporated, thus allowing the processor to continue execution 
even when performing a quick succession of writes. Only when the write 
buffer is filled must the processor stall; simulations have shown that 
significantly less than 1% of processor clock cycles are lost to write buffer 
full stalls. 

• Allow the memory system to optimize for fast writes. To this end, a
number of design decisions were made: the WrNear signal is provided to
allow page mode writes to be used in even simple memory systems; the
A/D bus presents the data to be written in the second phase of the first clock
cycle of a write transaction; and writes can be performed in as few as two
clock cycles.


Although it may be counter-intuitive, a significant percentage of the bus
traffic will in fact be processor writes to memory. This can be demonstrated if 
one assumes the following: 

Instruction Mix: 
ALU Operations 55% 
Branch Operations 15% 
Load Operations 20% 
Store Operations 10% 

Cache Performance 
Instruction Hit Rate 98% 
Data Hit Rate 96% 


Under these assumptions, in 100 instructions, the processor would 
perform: 
2 Reads to process instruction cache misses on instruction fetches 
4% x 20 = 0.8 reads to process data cache misses on loads 
10 store operations to the write through cache 
Total: 2.8 reads and 10 writes 


Thus, in this example, over 75% of the bus transactions are write operations, 
even though only 10 instructions were store operations, vs. 100 instruction 
fetches and 20 data fetches. Thus, it is appropriate to optimize the DRAM 
subsystem for page mode write operations. 
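
The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch that assumes, as the text does, a write-through cache in which every store reaches the bus; the function and parameter names are chosen for illustration only.

```python
def bus_traffic(instructions, load_fraction, store_fraction,
                instruction_hit_rate, data_hit_rate):
    """Estimate bus reads and writes for a write-through cache."""
    instruction_reads = instructions * (1.0 - instruction_hit_rate)
    data_reads = instructions * load_fraction * (1.0 - data_hit_rate)
    writes = instructions * store_fraction  # every store reaches the bus
    reads = instruction_reads + data_reads
    return reads, writes, writes / (reads + writes)


reads, writes, write_share = bus_traffic(
    instructions=100, load_fraction=0.20, store_fraction=0.10,
    instruction_hit_rate=0.98, data_hit_rate=0.96)
print(f"{reads:.1f} reads, {writes:.1f} writes, {write_share:.1%} writes")
```

With the instruction mix and hit rates given above, the write share comes to roughly 78% of bus transactions, consistent with the "over 75%" figure.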


TYPES OF WRITE TRANSACTIONS 

Unlike instruction fetches and data loads, which are usually satisfied by the 
on-chip caches and thus are not seen at the bus interface, all write activity is 
seen at the bus interface as single write transactions. There is no such thing 
as a “burst write”; the processor performs a word or subword write as a single 
autonomous bus transaction; however, the WrNear output does allow successive
write transactions to be processed using page mode writes. This is particularly 
important when “flushing” the write buffer before performing a data read. 

In processing writes, there is only one parameter of interest: the latency of 
the write. This latency is influenced by the overall system architecture as well 
as the type of memory system being addressed: time required to perform
address decoding and bus arbitration, memory pre-charge requirements, and
memory control requirements, as well as memory access time. WrNear may be 
used to reduce the latency of successive write operations. In addition, WrNear 
may be used to bypass the address decoder; if the memory controller retired 
the last transaction, and WrNear is asserted for this write, then obviously that 
memory controller will also be responsible for this transaction, and the system 
does not need to wait for the output of the address decoder. 
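
The decoder bypass described above can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular system: the memory map in `example_decode` and the controller names are hypothetical, and the decoder is modeled as an ordinary function.

```python
def select_controller(address, wr_near, last_controller, decode):
    """Pick the memory controller that should respond to a write.

    decode: function mapping an address to a controller name (a stand-in
            for the external address decoder; hypothetical).
    last_controller: controller that retired the previous transaction,
                     or None if there is none.
    """
    if wr_near and last_controller is not None:
        # WrNear asserted: the controller that retired the last transaction
        # is also responsible for this one, so the decoder is not consulted.
        return last_controller
    return decode(address)


def example_decode(address):
    # Hypothetical memory map, for illustration only.
    return "dram" if address < 0x1000_0000 else "io"


print(select_controller(0x0000_1004, True, "dram", example_decode))
print(select_controller(0x1F00_0000, False, "dram", example_decode))
```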

The R3051 family has been designed to accommodate a wide variety of 
memory system designs, including no wait cycle operations (write completed 
in two cycles) through simpler, slower systems incorporating many bus wait 
cycles.








WRITE INTERFACE TIMING OVERVIEW 
The protocol for transmitting data from the processor to memory and I/O 
devices is discussed below. 


Memory Addressing 

A write transaction begins when the processor asserts its Wr control output, 
and also drives the address and other control information onto the A/D and 
memory interface bus. Figure 2.7 illustrates the start of a processor write 
transaction, including the addressing of memory and presenting the store data 
on the A/D bus. 

The addressing occurs in a half-cycle of the SysClk output. At the rising edge
of SysClk, the processor will drive the write target address onto the A/D bus. 
At this time, ALE will also be asserted, to allow an external transparent latch 
to capture the address. Depending on the system design, address decoding 
could occur in parallel with address de-multiplexing (that is, the decoder could 
start on the assertion of ALE, and the output of the decoder captured by ALE), 
or could occur on the output side of the transparent latches. During this phase, 
WrNear will also be determined and driven out by the processor. 


Data Phase 
Once the A/D bus has presented the address for the transfer, the address 
is replaced on the A/D bus by the store data. This occurs in the second phase 
of the first clock cycle of the write transaction, as illustrated in Figure 2.7. 
The processor enters the data phase by performing the following sequence 
of events: 
• It negates ALE, causing the transparent address latches to capture the
contents of the A/D bus.
• It internally captures the data in a register in the bus interface unit, and
enables this register onto its output drivers on the A/D bus. The
processor design guarantees that the ALE is negated prior to the address
being removed from the A/D bus.


Thus, the processor A/D bus is driving the store data by the end of the
second phase of the write transaction. At this time, it begins to look for the end
of the write cycle.

There are only two methods for the external memory system to terminate a
write operation:

• It can supply an ACK (acknowledge) to the processor, to indicate that it
has sufficiently processed the write request, and the processor may
terminate the write.

• It can supply a BusError to the processor, to indicate that the requested
data transfer has “failed” on the bus. The system interface behavior of the
processor when BusError is presented is identical to the behavior when
ACK is asserted. In the case of writes terminated by BusError, no
exception is taken, and the data transfer cannot be retried.


Figure 2.8 shows the timing of the control signals when the write cycle is 
being terminated. 


WRITE TIMING DIAGRAMS 
This section illustrates a basic write from an R3051 family CPU. The values
for the AC parameters referenced are contained in the R3051 family data sheet. 


Basic Write 

Figure 2.9 illustrates the case of a basic write. In this figure, two bus wait 
cycles were required before the data was retired. Thus, two rising edges of 
SysClk occurred where ACK was not asserted. On the third rising edge of 
SysClk, ACK was asserted, and the write operation was terminated.

Figure 2.8 End of Write
Figure 2.9 Basic Write

DMA ARBITER INTERFACE

The R3051 family contains provisions to allow an external agent to remove 
the processor from its memory bus, and thus perform transfers (DMA). These 
provisions use the DMA arbiter to coordinate the external request for mastership 
with the CPU read and write interface. 

The DMA arbiter interface uses a simple two signal protocol to allow an 
external agent to obtain mastership of the external system bus. Logic internal 
to the CPU synchronizes the external interface to the internal arbiter unit to 
ensure that no conflicts between the internal synchronous requesters (read and
write engines) and the external asynchronous (DMA) requester occur.


INTERFACE OVERVIEW 

An external agent indicates the desire to perform DMA requests by asserting 
the BusReq input to the processor. DMA requests have the highest priority;
thus, once a request is detected, it is guaranteed to gain mastership at the
next arbitration.

The CPU indicates that the external DMA cycle may begin by asserting its 
BusGnt output on the rising edge of SysClk after BusReq is detected with 
appropriate set-up time to the external rising edge of SysClk. During DMA 
cycles, the processor holds the following memory interface signals in tri-state: 

• A/D Bus

• Addr(3:2)

• Interface control signals: Rd, Wr, DataEn, Burst/WrNear, and ALE

• Diag(1:0)


In addition to tri-stating these signals, the CPU will ignore transitions on 
RdCEn, ACK, and BusError during DMA cycles. 

Thus, the DMA master can use the same memory control logic as that used 
by the CPU; it may use Burst, for example, to obtain a burst of data from the 
memory; it may use RdCEn to detect whether the memory has satisfied its 
request, etc. Thus, DMA can occur at the same speed at which the R3051 
family allows data transfers on its bus (a peak of one word per clock cycle). 
During DMA cycles, the processor will continue to operate out of cache until 
it requires the bus. 

The external agent indicates that the DMA transfer has terminated by 
negating the BusReq input to the processor, which is sampled on the rising 
edge of SysClk. BusGnt is negated on a falling edge of SysClk, so that it will 
be negated before the assertion of Rd or Wr for a subsequent transfer. On the 
next rising edge of SysClk, the processor will resume driving tri-stated signals. 

Note that there is no hardware coherency mechanism defined for DMA 
transfers relative to either the internal caches or the write buffer. Software 
must explicitly manage DMA transfers to ensure that data conflicts are avoided.
This is an appropriate trade-off for the vast majority of embedded applications. 





DMA ARBITER TIMING DIAGRAMS 


These figures reference AC timing parameters whose values are contained 
in the R3051 family data sheet. 


Initiation of DMA Mastership 

Figure 2.10 shows the beginning of a DMA cycle. Note that if BusReq were 
asserted while the processor was performing a read or write operation, BusGnt 
would be delayed until the next bus slot after the read or write operation is 
completed.

To initiate DMA, the processor must detect the assertion of BusReq with proper
set-up time to SysClk. Once BusReq is detected, and the bus is free, the
processor will grant control to the requesting agent by asserting its BusGnt
output, and tri-stating its output drivers, from a rising edge of SysClk. The bus
will remain in the control of the external master until it negates BusReq,
indicating that the processor is once again the bus master.


Relinquishing Mastership Back to the CPU 

Figure 2.11 shows the end of a DMA cycle. The next rising edge of SysClk 
after the negation of BusReq is sampled may actually be the beginning of a 
processor read or write operation. 

To terminate DMA, the external master must negate the processor BusReq 
input. Once this is detected (with proper setup and hold time), the processor 
will negate its BusGnt output on the next falling edge of SysClk. It will also
re-enable its output drivers. Thus, the external agent must disable its output
drivers by this clock edge, to avoid bus conflicts.

Figure 2.10 DMA Mastership Request

DRAM OPERATION

INTRODUCTION

The R3051 typically executes out of cache. However, if the data or 
instruction it is attempting to fetch is not available from the on-chip caches, 
it must be fetched from main memory, as shown in Figure 3.1. Also, since the 
R3051 cache uses a write-through policy, all writes appear on the memory bus. 

The effect of main memory on CPU performance depends on both the cache 
size, and the program running on the processor. The locality of memory 
references and the number of writes will vary with each program. The effect 
of main memory on CPU performance will be more pronounced for programs 
with either low locality, a large proportion of writes, or a system with a small 
cache. 

Typical cost-sensitive embedded applications utilize DRAMs for the processor 
main memory, based on the density/cost/power of these devices. However, 
DRAMs require special control in order to access their contents. These control 
actions are performed by the R3721. In order to better understand the R3721,
a basic understanding of the DRAM control requirements is required. Those 
familiar with the basics of DRAM may skip this chapter. 


DRAM ARCHITECTURE 

There are two common configurations of DRAMs: "Separate I/O" and
"Common I/O". Separate I/O devices typically are "x1" DRAMs (a single data
bit of I/O). These store 1 bit of data in N different locations and provide two
pins, D and Q, for input and output data. Other DRAM configurations are "x4"
and "x8", and use a common I/O structure. To conserve package size, these use
bi-directional data pins for both input and output data. Figure 3.1 shows the
internal organization of a "x1" DRAM with separate I/O; Figure 3.2 shows the
internal organization of a "x4" DRAM with common I/O.

To achieve high density, DRAMs are implemented using a capacitive
memory cell. While this memory cell can be implemented as a much smaller
cell than a typical SRAM cell, these memory cells do have a capacitive time
constant whereby they discharge their value over a relatively short time. Thus,
DRAMs must be refreshed periodically, as described below.

Further, DRAMs exhibit a "pre-charge" requirement for both RAS and CAS. 
During this time, internal bit lines are precharged back to a high level prior to 
sampling or writing a particular bit cell. The requirements of the multiplexed 
address interface, pre-charge and refresh requirements, and the timing 
associated with general DRAM control make the design of a DRAM subsystem 
relatively more complex than a standard SRAM or EPROM subsystem. The 
R3721 manages all of these requirements in an R3051-based system.

DRAMs are available today in various densities, speeds, and modes. 
Densities vary from 256 Kbits to 4 Mbits, with 1 Mbit devices being common 
due to price and density. The selection of density and depth depends upon the 
size of main memory and the number of banks supported. Since the R3051 is 
a 32-bit-wide CPU, each DRAM bank must be 32 bits wide. For example, when 
using 4M-by-1 devices, up to 16 Mbytes of main memory can be supported per 
bank. Conversely, when using 256K-by-4 only 1 Mbyte of memory can be 
supported per bank.

Figure 3.2 Common I/O "x4" DRAM

DRAM speeds are typically measured as address-to-data access times,
which can vary from 150ns to 60ns and faster. Besides offering various access 
times, DRAMs are also available in various data access modes, such as Nibble 
mode, Page Mode, and Static Column mode. 
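
The per-bank capacities quoted earlier in this section follow directly from the device organization and the 32-bit bank width. The sketch below shows the arithmetic; the function name is illustrative, and it counts data devices only (any additional devices a system might add, for example for parity, are outside this sketch).

```python
def bank_organization(device_depth, device_width_bits, bank_width_bits=32):
    """Return (devices per bank, bank capacity in bytes) for a DRAM bank."""
    devices = bank_width_bits // device_width_bits
    capacity_bytes = device_depth * bank_width_bits // 8
    return devices, capacity_bytes


MBYTE = 1024 * 1024
for name, depth, width in [("4M x 1", 4 * 1024 * 1024, 1),
                           ("256K x 4", 256 * 1024, 4)]:
    devices, capacity = bank_organization(depth, width)
    print(f"{name}: {devices} devices, {capacity // MBYTE} Mbyte(s) per bank")
```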


Normal Access 


Accessing data in DRAM differs from accessing data in an SRAM. DRAM 
uses a multiplexed addressing arrangement, which allows for a smaller 
package size but increases the complexity of the interface. The system must 
provide a row address followed by a column address. The row address is 
latched by the Row Address Strobe (RAS), while the column address is latched 
by the Column Address Strobe (CAS). To meet pre-charge requirements, both 
RAS and CAS must be kept inactive for a minimum pre-charge delay after a 
read cycle. Figure 3.3 shows DRAM access timing. 

Note that this multiplexed address also corresponds well to the internal 
organization of the DRAM. 
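
As an illustration of the multiplexed address, the following Python sketch splits a cell address into the row and column halves that would be presented with RAS and CAS. It assumes the row occupies the upper bits and the column the lower bits; this split is an assumption of the sketch, and a real controller may assign processor address bits differently.

```python
def split_row_column(cell_address, column_bits):
    """Split a DRAM cell address into (row, column).

    Assumes the row is carried in the upper bits and the column in the
    lower `column_bits` bits (an illustrative choice).
    """
    row = cell_address >> column_bits
    column = cell_address & ((1 << column_bits) - 1)
    return row, column


# A 1M-deep device has 20 address bits: 10 row bits and 10 column bits,
# so only 10 address pins are needed instead of 20.
row, column = split_row_column(0xABCDE, column_bits=10)
print(hex(row), hex(column))
```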


Page Mode and Static Column Accesses 


Both Static Column and Page Mode DRAMs support a normal (slow) first 
access in which RAS and CAS are asserted as in a normal DRAM. At the end 
of the cycle, RAS and/or CAS can stay low, thus eliminating both the pre- 
charge requirement and the address multiplexing delay for subsequent 
accesses. In page-mode devices row and column addresses are strobed into the 
DRAM on the first access, and only CAS needs to be recycled for subsequent 
accesses which share the same Row address as the first access. However the 
setup, hold, transition, and pre-charge times for CAS must still be met. In 
Static Column mode the setup, hold, transition and pre-charge delays associated 
with CAS are eliminated by keeping CAS as well as RAS low after the first 
access. Only the column address needs to be changed for subsequent 
accesses. However, in Static Column mode the output buffers of the DRAM are 
enabled, thus consuming more power than Page Mode devices. 

The use of page mode or static column mode allows high bandwidth from
relatively slow devices. In order to reduce power consumption of the system 
while maintaining high performance, the R3721 uses page mode accesses 
rather than static column. Using page mode allows fast writes, as well as high- 
performance single and quad word reads. 

Figure 3.4 shows a series of DRAM page mode accesses. 
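
To give a feel for how often page mode applies, the sketch below counts the accesses in a sequence that share a row address with the access immediately before them. It is a simplified model under stated assumptions: the row is taken to be the upper bits of the cell address, and the row is assumed to stay open (RAS held low) between accesses.

```python
def count_page_mode_accesses(cell_addresses, column_bits):
    """Count accesses that hit the same row as the previous access."""
    open_row = None
    page_mode_accesses = 0
    for address in cell_addresses:
        row = address >> column_bits
        if row == open_row:
            page_mode_accesses += 1  # only CAS needs to be cycled
        open_row = row               # a new row requires a full RAS/CAS access
    return page_mode_accesses


# Three accesses in row 0 followed by two accesses in row 1.
print(count_page_mode_accesses([0x000, 0x001, 0x002, 0x400, 0x401], 10))
```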


DRAM Refresh and Pre-charge 


Since DRAMs use a capacitive device to hold a memory bit, each DRAM cell
must be periodically refreshed to ensure the cell does not lose its charge. Most 
DRAMs require a complete refresh every 4 to 8 ms. Most DRAMs reserve the 
most significant bit as a select bit for an internal multiplexer that selects data 
in two banks of arrays; in these DRAMs, the lower address bits access memory 
cells in both banks simultaneously. This halves the number of refresh cycles 
needed. For example, a DRAM with 1024 rows now only needs 512 refreshes 
within the refresh period.

Figure 3.3 Normal DRAM Access
Figure 3.4 Page Mode DRAM Access


A variety of methods are used for refreshing DRAMs; the two most commonly 
used are CAS-Before-RAS and RAS-only refresh. The CAS-before-RAS method 
does not require an externally-generated Row address; instead, the DRAM uses 
its own address counters. With RAS-only refresh the refresh address must also 
be generated externally and thus requires additional components for the refresh 
address counter. Since virtually all modern DRAMs support CAS-before-RAS 
refresh, this is the technique used by the R3721. 
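
The refresh figures given earlier in this section translate into a refresh request rate as follows. This sketch assumes the refresh cycles are spread evenly across the refresh period; that even spacing is an assumption of the example rather than a statement about how any particular controller schedules refresh.

```python
def refresh_interval_us(refresh_period_ms, refresh_cycles):
    """Average time between refresh cycles, in microseconds."""
    return refresh_period_ms * 1000.0 / refresh_cycles


# 512 refresh cycles within an 8 ms or a 4 ms refresh period.
for period_ms in (8, 4):
    interval = refresh_interval_us(period_ms, 512)
    print(f"{period_ms} ms / 512 cycles: one refresh every {interval} us")
```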

Another requirement of DRAM is pre-charge after every read operation. 
Performing a memory-read operation from a DRAM cell causes its capacitor to 
discharge slightly. To remedy this, data must be written back into the cell after 
each read operation. This write-back operation is called pre-charge and is 
automatically handled by the DRAM. However, the DRAM controller must ensure
that the DRAM RAS and CAS pre-charge timing requirements are met for proper
operation.

MEMORY SYSTEM CONFIGURATIONS

A DRAM memory system can be designed in a number of different ways. Two 
of the major ways are: interleaved and non-interleaved. 

An interleaved memory system works by dividing the memory system into 
two or more arrays. Consecutive addresses are distributed amongst the 
arrays: for example, in a two-way interleaved memory system, even word 
addresses reside within one bank, while odd word addresses reside in another 
bank. 

Interleaved memory systems offer the advantage of higher bandwidth during 
multi-word transactions. For example, if, by using page mode, a particular 
system can read a new value from each DRAM once every 80ns (including 
access time, prop delay, set-up time, and CAS precharge), then using two way 
interleaving allows two new data values to be read each 80ns, or a new value
every 40ns. For a 25MHz CPU, this is the maximum data rate of the CPU. Thus,
two way interleaving is used to double the bandwidth from the memory to the 
CPU. Figure 3.5 illustrates a two-way interleaved memory system. The concept 
of interleaving can be extended to 4, 8, or n-way interleaved memory systems. 
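
The bandwidth argument above reduces to simple arithmetic, sketched here in Python. The 80ns page mode cycle and the 25MHz clock are the figures used in the text; the function names are illustrative.

```python
def word_period_ns(page_cycle_ns, interleave_ways):
    """Time between successive words when the arrays are accessed in parallel."""
    return page_cycle_ns / interleave_ways


def cpu_clock_period_ns(frequency_mhz):
    return 1000.0 / frequency_mhz


for ways in (1, 2):
    period = word_period_ns(80.0, ways)
    keeps_up = period <= cpu_clock_period_ns(25.0)
    print(f"{ways}-way: one word every {period} ns, "
          f"matches a 25MHz CPU: {keeps_up}")
```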

The disadvantage of interleaved memory is that it requires multiple 32-bit
DRAM arrays, each with independent 32-bit data busses that are
transceivered/multiplexed onto a single CPU data bus. Thus, the minimum system
configuration includes more memory (and thus cost) than the minimum
configuration of a non-interleaved memory system. In addition, a memory data
path for each memory array must be provided.







Figure 3.5 Interleaved Memory System (R3051 family CPU, CPU address/data bus,
73720 Bus Exchanger, odd and even DRAM banks)

Independent of memory interleaving, multiple banks of memory can be
under the control of a single DRAM controller and can use a single memory to 
CPU data path. In an interleaved system, each "bank" actually contains 
multiple 32-bit arrays of memory (for example, in a two-way interleaved system 
with two banks, there are two pairs of even and odd memory arrays; both even 
arrays use a single data path). While distinguishing between even and odd 
arrays in an interleaved memory system uses low order processor address bits, 
distinguishing between multiple banks of memory uses high-order bits. For 
the R3051 family, a quad word read will never cause a "bank crossing", and 
thus the relatively slow output enable/disable time of DRAMs is not a problem.
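
The distinction between array selection and bank selection can be illustrated with a small address decoding sketch. The bit assignment used here is hypothetical and chosen only to mirror the description above (low-order bits select the even or odd array, high-order bits select the bank); it is not the R3721 address mapping.

```python
def locate_word(word_address, words_per_array, ways=2):
    """Map a word address to (bank, array, offset) in an interleaved system.

    Hypothetical mapping for illustration: the low-order bits select the
    array within a bank (0 = even, 1 = odd for two-way interleaving) and
    the high-order bits select the bank.
    """
    array = word_address % ways
    index = word_address // ways
    bank = index // words_per_array
    offset = index % words_per_array
    return bank, array, offset


WORDS_PER_ARRAY = 1024 * 1024  # for example, an array of 1M-deep devices

# The four words of an aligned quad word read alternate between the even
# and odd arrays but stay within one bank.
base = 2 * WORDS_PER_ARRAY - 4  # last quad word of bank 0 in this model
for word in range(base, base + 4):
    print(word, locate_word(word, WORDS_PER_ARRAY))
```

In this model the next word address after the quad word shown falls in bank 1, so the bank boundary always coincides with a quad word boundary.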

One might wonder why a system which includes two banks of DRAMs does 
not use interleaving to attain the benefit of higher bandwidth. There are a 
number of reasons such a decision might be made: some technical, and some 
business. 

An interleaved memory system requires each 32-bit memory array to have 
its own dedicated data path to the DRAMs; this is because the tri-state enable/ 
disable times of DRAMs are too slow to attain the desired bandwidth back to 
the CPU. In a banked system, DRAM output enable timing is not a critical 
parameter, and thus multiple banks can share the same data path with no 
performance penalty. 

The other difficulty with interleaved systems comes from a business model. 
Interleaved systems require that the "base configuration" offer a minimum of
two banks of memory, and that memory upgrades occur in pairs. A non- 
interleaved memory configuration can start with half as much memory in the 
base model, and a memory upgrade of only a single array can be offered. Thus, 
a less expensive base model can be offered, and less expensive upgrades can 
be offered. This may fit a particular marketing requirement of the machine. 


SUMMARY 

DRAMs offer advantages in terms of cost and density of memory. However, 
they also introduce complexity in their control and system interface. The 
R3721 automatically handles this interface between the R3051 and the DRAM 
sub-system, allowing the benefits of DRAMs to be attained with minimal cost 
and complexity to the system designer.

INTRODUCTION

The IDT79R3721 DRAM controller is a single chip DRAM controller for 
systems based on an IDT79R3051 family CPU. It provides all of the control 
timing to interface from the CPU address/data bus and control bus through to 
the DRAM address and control interface, and also provides control of the data 
path between the DRAMs and the CPU. 

The R3721 has been designed to support a wide variety of DRAM sub- 
systems across a wide frequency range of R3051 CPUs. This chapter is 
intended to provide an overview of these capabilities; subsequent chapters 
provide more in-depth details on how these features work, and the specific 
timing associated with various memory configurations. 


R3051 BUS INTERFACE 

The R3721 is designed to reside directly on the R3051 family A/D and control
busses. To complete the system design, an external address decoder and
external data path chips, such as the IDT73720 Bus Exchangers or
IDT74FCT245 bi-directional transceivers, are required.

Regardless of size or organization of DRAM, the R3721 is always connected 
to particular bits of the R3051 A/D bus. The R3721 uses programmed values 
for the DRAM size and configuration to internally multiplex R3051 address 
lines into the appropriate row and column addresses for the DRAM. Table 4.1 
shows the internal multiplexing of addresses performed by the R3721. Table 
4.2 shows the DRAM bank selection, and which RAS/CAS control signals are 
output. 






Address assignment for 256K x 1 and 256K x 4 DRAMs
Address assignment for 1M x 1 and 1M x 4 DRAMs
Address assignment for 4M x 1 and 4M x 4 DRAMs

Table 4.1 Processor to DRAM Address Multiplexing

Table 4.2 Bank Selection in Multi-Bank System


The R3721 monitors the processor ALE, Rd, Wr, and Burst/WrNear control
signals to determine the type of cycle in progress. The R3721 contains its own 
address latches, and aligns processor address outputs with DRAM Row and 
Column addresses. 

If the external address decoder indicates that this transfer is intended for the 
DRAM sub-system, the R3721 performs the DRAM control interface (using 
timing programmed into the device during system boot). At the appropriate 
time, the DRAM controller will return the RdCEn/ACK handshake back to the
processor to indicate that the transaction is sufficiently completed. 

The interface to ACK and RdCEn is performed using a tri-stateable output 
driver with an internal pull-up. This allows other tri-stateable sources to 
directly drive ACK and RdCEn without introducing combinatorial logic delays 
inherent in combining the acknowledgment of multiple memory subsystems. 


R3721 DRAM INTERFACE 

The R3721 has been designed to interface to a wide variety of DRAM 
subsystems. Various options include: 

• Interleaved vs. Non-Interleaved

Interleaved memory subsystems offer higher system performance by 
providing higher bandwidth to the processor during quad word refills. 
However, an interleaved memory system requires a larger "base" amount 
of memory (two 32-bit arrays minimum) and a wider data path (one for 
each array, time multiplexed onto a single CPU bus). 

The R3721 offers the system designer the flexibility to design either type
of memory system. In fact, with proper planning, the system designer can 
offer a base model that does not perform memory interleaving, but allow 
field upgrades to perform interleaving (thus increasing both the memory 
and raw performance of the system). 

• Various densities of DRAM

The R3721 allows the system designer to use DRAM densities from 
256K x 1 through 4M x 4. Thus, depending on the memory requirements 
of the application, the system designer can decide the appropriate 
memory subsystem for the application. In addition, the DRAM controller 
internally aligns the CPU address bus with the DRAM address lines; this 
allows a later field upgrade to increase the density of memory devices used 
without requiring jumpering of address lines. Table 4.1 shows the 
internal multiplexing of address lines which allows the R3721 to support 
varying densities of DRAMs, without changing its interface to the processor
bus. 

• Single bank or multiple banks of memory

The R3721 allows systems to be constructed with one to four banks (32-bit
wide memory arrays) of memory (either interleaved or not). Obviously,
it has been designed to allow various strategies of "field upgrades" in the
DRAM memory sub-system. 

The R3721 utilizes high-performance output drivers, and four sets of 
the RAS and CAS DRAM controls, to directly drive up to 36 DRAM devices 
without external drivers. The R3721 uses a high-power output driver with 
built-in series resistance to avoid the noise problems typically associated 
with driving large capacitive loads.

In addition to the capability to directly drive these large loads, the
R3721 also allows the system designer to incorporate additional, external
memory drivers if needed. The various timing options can be selected to
accommodate the additional delay of buffer drivers in the DRAM subsystem.
The R3721 takes care of the particular case of partial writes. CAS(3:0) are
used to provide selective enabling of those DRAMs being written; that is, 
only those byte lanes involved in the write will have their corresponding 
CAS signals asserted. 

• Intelligent Control interface to take advantage of Page Mode DRAMs

The R3721 state machine was designed after extensive simulation of 
R3051 program behavior. Optimizations around typical locality of 
reference are included in the state machine for the R3051. 

Figure 4.1 shows the basic state machine for the R3721. Note that it is 
optimized for series of page mode DRAM accesses. 
Specifically, page mode is used for: 
— Burst Refill 
Page mode is used to obtain words within a quad word read. 

However, simulation has shown that the most likely next transfer is a
single word write; thus, RAS and CAS are negated at the end of the burst
refill to minimize the latency of subsequent operations due to RAS
pre-charge.
— Single Reads 

After a single read, the DRAM controller will leave the DRAMs 
expecting a subsequent page mode access to the same page (either 
another read, a write, or a burst refill). The R3721 includes an on-chip 
page comparator which uses the DRAM density programmed into the 
device to determine whether or not a given access can take advantage 
of page mode. 

Figure 4.1 R3721 State Machine 

— Single Writes 
After a single write, the DRAM controller will leave the DRAMs 
expecting a subsequent page mode access to the same page (either 
another write, a read, or a burst refill). The DRAM controller can use 
either WrNear, or its internal page comparator, to detect opportunity for 
page mode accesses. 
Thus, the R3721 has truly been optimized to the operating environment of 
the R3051 based systems. 
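
As a rough illustration of the page comparison described above, the following 
Python sketch treats two accesses as falling in the same DRAM page when they 
agree in every address bit above the column address. The function name, the 
non-interleaved 32-bit wide bank, and the 9 column bits (a 512-entry page) are 
assumptions made for the example; the actual address alignment used by the 
R3721 is the one given in Table 4.1. 

```python
def same_page(prev_addr, next_addr, column_bits=9):
    """Return True if two byte addresses share a DRAM page.

    Assumes (hypothetically) a non-interleaved 32-bit wide bank, so the two
    lowest address bits select a byte within the word and the next
    `column_bits` bits form the DRAM column address.
    """
    page_shift = 2 + column_bits
    return (prev_addr >> page_shift) == (next_addr >> page_shift)


# With 9 column bits, a page of 32-bit words spans 2 KBytes.
print(same_page(0x0000_0000, 0x0000_07FC))  # last word of the same page
print(same_page(0x0000_0000, 0x0000_0800))  # first word of the next page
```

A different programmed DRAM density would change only `column_bits` in this 
model, which is why the comparator depends on the density field. 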
• Various speeds of DRAMs and Processors 
The R3721 has been designed to support a wide range of processor 
frequencies, across a wide range of DRAM speeds. The system designer 
can configure varying times for the DRAM control signals. Programmable 
DRAM control parameters include: 
— RAS to CAS Delay 
This allows the system designer to control a number of critical 
timings, including row address hold time from RAS and the RAS to CAS 
delay requirements of the system. 
— RAS and CAS pulse widths 
These parameters directly control the access time of the DRAM, and 
the resulting system performance. 
— RAS and CAS pre-charge times 
These parameters allow the system designer to minimize the 
performance penalty of DRAM pre-charge, yet still ensure proper 
system operation. 
— Refresh period 
Depending on the system speed, the DRAM controller will be 
programmed with the appropriate counter value to ensure both proper 
refresh operation and that the maximum RAS low time of the 
DRAM is not violated. The R3721 uses a CAS-before-RAS refresh protocol 
to perform DRAM refresh. 
— Address decode time 
The DRAM controller can work in systems which can properly 
decode addresses within the first cycle of a transfer, for optimal 
performance. Alternately, the DRAM controller can work with slower 
systems, requiring an extra half-cycle to perform proper address 
decoding. 
• Various data path options 
The R3721 directly controls the data path between the CPU and the 
DRAM sub-system. The R3721 can control either a set of IDT74FCT245s 
(for non-interleaved memory systems) or IDT73720s (for either multiple 
banked or interleaved memory configurations). 
The R3721 allows this variety of options through the use of the on-chip mode 
register pictured in Figure 4.2. Subsequent chapters will discuss the fields of 
the Mode register, and the impact of the various options on system design. 

Figure 4.2 Mode Register of DRAM Controller 

PIN DESCRIPTION 

This section describes the signals used in the above interfaces. More detail 
on the actual use of these pins is found in other chapters. Note that signals 
indicated with an overbar are active low. 


R3051 Interface 


Reset I 

Reset: An active low input used to reset the DRAM controller state 
machines. Reset causes the R3721 to load the mode register with default 
values, and performs 16 CAS-before-RAS refresh cycles to the DRAMs to 
initialize them. 





A/D(25:0) I 
Address/Data(25:0): These signals are connected directly with A/D(25:0) 
of the R3051 family CPU. The DRAM controller uses these inputs to obtain: 
   BE(3:0):        Individual data byte enables used in write operations. 
   Address(25:4):  Address bits used to select amongst banks of DRAMs, 
                   and Row and Column addresses, according to Tables 
                   4.1 and 4.2. 
   Data(15:0):     During the data phase of Mode register write 
                   operations, the A/D bus carries the values to be 
                   written into the mode register. 


Addr(3:2) I 

Low Order Address(3:2): These signals carry the word within quad word 
address currently expected by the processor. During single reads, or writes, 
these inputs carry the specified address. During quad word reads, the DRAM 
controller uses an internal counter to manage word within quad word 
addressing, and thus ignores these inputs. 


ALE I 

Address Latch Enable: This signal is used to de-multiplex the A/D bus from 
address to data phase. The R3721 uses this signal to capture the current value 
of A/D(25:0) and Addr(3:2) during the address phase. The R3721 also uses this 
signal as the indication of the beginning of a memory transfer, and awaits the 
CS input, according to the timing specified in the mode register. 


Rd I 
Read: Indicates that the current transfer is a read (single or burst). 


Wr I 
Write: Indicates that the current transfer is a write (near or not). 


Burst/WrNear I 

Burst: During reads, this signal functions as the "Burst" indicator. If burst 
is asserted during a read, the R3721 knows that a quad word read sequence 
is expected. 

WrNear: During writes, this signal functions as the "Write Near" indicator. 
If the DRAM controller state machine is in the "IDLE, RAS asserted" state, it 
may use this signal to process the write in two cycles. 


SysClk I 

System Clock: This is the master timing reference, and is a direct 
connection from the SysClk output of the R3051 family processor. All timing 
events are referenced to the SysClk input. 


CS I 

DRAM Chip Select: This input is provided by the external address decoder, 
and is used to indicate that this R3721 controls the DRAM responsible for 
processing this transfer. The R3721 uses the programmed value in the Mode 
Register to determine when to sample this input. 


MSel I 

Mode Register Select: This input is provided by the external address 
decoder, and is used to indicate that this transfer targets the internal mode 
register of the R3721. To write to the mode register, both CS and MSel must 
be asserted by the external address decoder. 


RdCEn O 

Read Buffer Clock Enable: This output to the R3051 processor indicates 
that the currently requested word will be available on its A/D bus at the next 
sampling clock edge (falling edge of SysClk). 

This output is a tri-stateable output; it is only driven by the R3721 in 
transfers in which its CS input is asserted at the proper time. It is internally 
pulled up, so that no external pull-up resistor is required. 


ACK O 

Acknowledge: This output to the R3051 family processor indicates that the 
R3721 has sufficiently processed the current transfer. 

On read operations, the processor uses this information to determine when 
to begin emptying the read buffer into the on-chip cache. The timing of this 
output during quad word reads is determined by the R3721 for optimal 
performance. The R3721 will release the processor to begin execution as early 
as possible in the transfer, but will ensure that the fourth word of the quad read 
is available before the processor obtains it from the read buffer. Thus, the 
processor can simultaneously execute the incoming instruction stream even 
while the R3721 obtains the remaining words of the transfer. 

On write operations, the processor uses this to terminate the write operation. 

This output is a tri-stateable output; it is only driven by the R3721 in 
transfers in which its CS input is asserted at the proper time. It is internally 
pulled up, so that no external pull-up resistor is required. 


DRAM Interface 


DAddr(10:0) O 

DRAM Address: These outputs are typically connected directly to the DRAM 
multiplexed row/column address inputs. Depending on the memory system 
organization and the organization of the DRAMs used, the R3721 will align the 
processor addresses with the DRAM addresses according to Table 4.1. 

These outputs incorporate series resistors to eliminate overshoot and 
undershoot problems associated with large capacitive loads. In addition, high- 
drive capability has been incorporated in these outputs. Thus, the R3721 can 
directly drive large numbers of DRAMs or multiple SIMM modules. 


RAS(3:0) O 

Row Address Strobe: These outputs are directly connected with the RAS 
inputs of the DRAMs on a bank basis, according to Table 4.2. The falling edge 
of this signal is used by the DRAM to capture the row address presented on 
DAddr(10:0). 

In order to directly drive multiple DRAM devices, these signals provide high 
drive, and incorporate series resistors. Each RAS signal may drive multiple 
loads with no system performance degradation. 


CAS(3:0) O 

Column Address Strobe: These outputs are directly connected with the 
CAS inputs of the DRAMs on a byte basis. The R3051 processor may write 
partial word quantities, in which case the R3721 only enables those DRAMs in 
the byte lane being updated. CAS(3) corresponds to BE(3); CAS(2) corresponds 
to BE(2); etc. The falling edge of this signal is used by the DRAM to capture the 
column address presented on DAddr(10:0). 

In order to directly drive multiple DRAM devices, these signals provide high 
drive, and incorporate series resistors. However, the propagation delay of CAS 
is a system critical parameter; thus, no CAS signal should drive more than 8 
loads. 
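
The byte lane correspondence above can be written out as a small Python 
sketch. It is only a logical model: the function name is hypothetical, and a 
set bit means "asserted" regardless of the electrical polarity of the pins. 

```python
def cas_asserted(byte_enables):
    """Return the CAS(3:0) lines asserted for a write.

    `byte_enables` is a 4-bit mask in which bit n set means BE(n) is
    asserted (logical sense only; pin polarity is ignored here).
    """
    if not 0 <= byte_enables <= 0xF:
        raise ValueError("BE(3:0) is a 4-bit field")
    # CAS(n) follows BE(n), so only the lanes being written are strobed.
    return [lane for lane in range(4) if byte_enables & (1 << lane)]


print(cas_asserted(0b0011))  # a halfword write on lanes 0 and 1
print(cas_asserted(0b1111))  # a full word write
```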


WBank(3:0) O 

Bank Write Enable: These outputs are used to individually control the write 
enables of various memory banks. In non-interleaved systems, all four outputs 
are asserted; RAS selects the specific bank to be written. In interleaved 
systems, they are enabled in pairs; that is, writes to an even bank cause 
WBank(2) and WBank(0) to be asserted, while writes to an odd bank cause 
WBank(3) and WBank(1) to be asserted. Again, only the specific bank being 
written will have its RAS asserted, and thus only that bank will be updated 
during the write. 
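
The selection rule just described can be summarized in a short Python sketch. 
The function name and the use of sets of signal indices are illustrative 
assumptions; the rule itself is the one stated in the text. 

```python
def write_controls(interleaved, bank):
    """Return (wbank, ras) as sets of asserted signal indices for a write.

    `bank` is the target bank number, 0 through 3.
    """
    if bank not in range(4):
        raise ValueError("bank must be 0..3")
    if not interleaved:
        wbank = {0, 1, 2, 3}      # all four write enables asserted
    elif bank % 2 == 0:
        wbank = {0, 2}            # even bank: WBank(2) and WBank(0)
    else:
        wbank = {1, 3}            # odd bank: WBank(3) and WBank(1)
    # Only the addressed bank has its RAS asserted, so only it is updated.
    return wbank, {bank}


print(write_controls(interleaved=True, bank=2))
```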

During refresh cycles, these outputs are negated. This avoids accessing the 
"test mode" built into modern 4Mb DRAMs. 

In order to directly drive multiple DRAM devices, these signals provide high 
drive, and incorporate series resistors. 


OE O 

DRAM Output Enable: This output is directly connected to the output 
enable of common I/O DRAMs. It is connected to all DRAMs under the control 
of the R3721. 


Data Path Control Interface 


DByteEn(3:0) O 

Data Path Byte Enable: These outputs are four identical output enables for 
the transceivers in the DRAM data path. Even in the case of partial writes, all 
four enables will be asserted; CAS(3:0) will control which devices actually get 
updated. 

In typical systems, DByteEn is connected on a byte lane basis to evenly 
distribute the load. For example, if the data path interface uses 74FCT245s, 
then each DByteEn is directly connected to the "OE" input of the transceiver on 
that byte lane. If the data path uses IDT73720 Bus Exchangers, DByteEn(1:0) 
are connected to the Bus Exchanger on the lower half of the data bus 
(Data(15:0)), and DByteEn(3:2) are connected to the Bus Exchanger on the 
upper half of the data bus (Data(31:16)). 


T/R O 

Transmit/Receive: This signal indicates the direction of the data path, and 
is connected directly to the T/R input of the 74FCT245 or IDT73720. A high 
output indicates that data is being transmitted from the CPU to the memory 
(write); a low output indicates a memory read. 


Path O 

Path: This signal is directly connected to the Path input of the IDT73720. 
It is used to specify the even or odd memory bank participating in the current 
transfer. This output is high if an even bank is the target of the transfer; it is 
low for an odd bank. 


YZLEn O 

Data Path Latch Enable: This signal is connected to the YLEn and the ZLEn 
inputs of the IDT73720 Bus Exchanger. It is used to capture the data provided 
by both banks of memory of an interleaved system, for later sequencing onto 
the processor A/D bus. The latches are transparent when this output is high, 
and closed when it is low. 


INTRODUCTION 
This chapter describes the organization of the mode register and gives a 
detailed illustration of the timings involved with each mode. Topics include: 
• A general overview of the various fields of the mode register and its 
operation. 
• A detailed description of the timing diagrams related to each field of the 
mode register. 
• The default settings of the mode register. 
• A detailed explanation of how to access the mode register. 


THE MODE REGISTER 

The mode register is a 16-bit write-only register used to configure the R3721 
to adapt it to a variety of different applications. Figure 5.1 illustrates the mode 
register. The settings of the mode register influence the signals used to control 
the external DRAM banks as well as the signals involved in controlling the data 
path. 

Figure 5.1 The Mode Register 

Bit      15    14    13    12    11    10    9     8 
Field    Rsvd  DCS   RF2   RF1   RF0   CP    Rsvd  CAS pulse width 

Bits     7:5         4     3     2               1:0 
Field    RAS timing  RCD   WrNr  Memory config.  DRAM size 





PROGRAMMING THE MODE REGISTER 

The mode register contains different fields that provide the R3721 with great 
flexibility in interfacing with a wide range of applications. Each field is used 
to control one aspect of the behavior of the R3721. All the fields get updated 
when writing to the mode register. 


DRAM Size Field 
Bits 0 and 1 of the mode register are used to inform the R3721 of the 
organization of the DRAMs used in the system as follows: 

This allows the R3721 to control up to a maximum of 64 MBytes of memory. 

External Memory Configuration 
Bit 2 of the mode register is used to program the physical configuration of 
the external memory and the data path. 


Bit 2    Memory configuration 
         Non-Interleaved memory system 
         Interleaved memory system; Bus Exchangers are used in the 
         data path. 

The R3721 always assumes that Bus Exchangers are used in the data path 
for the interleaved configuration. In the Non-Interleaved configurations, it is 
possible to connect either standard transceivers or Bus Exchangers. 









Write Near 

The R3721 has the ability to use the R3051 WrNear output to provide fast 
page mode writes. The extra delay may be appropriate in certain memory 
configurations, as discussed in later chapters. 


Bit 3 
WrNr    Use of WrNear 
  0     Use of WrNear is enabled 
  1     Use of WrNear is disabled 


RAS to CAS Delay 

Bit 4 of the mode register specifies the delay between the assertion of the 
appropriate RAS signal to the assertion of the related CAS signal. This delay 
can be programmed to be either one clock cycle or two clock cycles. Figure 5.2 
illustrates the effect of the RCD bit. 

The DRAM controller always transitions the DAddr bus from Row Address 
to Column Address one-half clock cycle before the assertion of CAS. 


Bit 4 
RCD     RAS to CAS delay 
  0     One clock cycle delay from RAS to CAS 
  1     Two clock cycles delay from RAS to CAS 

Figure 5.2 RAS to CAS Delay 
(Timing diagram: RASn, CASn and DAddr, switching from Row Addr to Column 
Addr, for the One Clock and Two Clocks settings.) 


RAS Timing 

Bits 5, 6 and 7 of the mode register specify the width of the RAS pulse in clock 
cycles as well as the RAS pre-charge time. This field gives the system designer 
the freedom to choose from a wide range of DRAM speeds based on a 
performance/cost criteria. Figure 5.3 illustrates the timings of the RAS signals. 

Logically speaking, RAS precharge occurs in two parts. A portion of the pre- 
charge occurs at the start of the transfer, and varies in duration from one to 
three clock cycles (depending on the programmed value). The second portion 
occurs at the end of the transfer, and is always one clock cycle long. This 
distinction allows the DRAM controller to avoid additional RAS pre-charge if 
the DRAM controller state machine was already in the "IDLE, RAS negated" 
state. 


Figure 5.3 RAS Signals Timing 
(Timing diagrams of RASn for three settings: RAS pulse = 2 clocks with RAS 
pre-charge = 2 clocks; RAS pulse = 3 clocks with RAS pre-charge = 3 clocks; 
RAS pulse = 4 clocks with RAS pre-charge = 4 clocks.) 


CAS Pulse Width 

Bit 8 of the mode register specifies the CAS pulse width in clock cycles. The 
CAS pulse width can be programmed to be 1.5 or 2.5 clock cycles. Figure 5.4 
illustrates the timings of the CAS pulse width. 

The CAS pulse width, along with the CAS precharge time, has the most 
dramatic impact on system performance. These parameters affect the 
performance of the various page mode accesses performed by the DRAM 
controller, and thus directly affect the timing of the RdCEn and ACK 
acknowledgment signals back to the processor. 


Bit 8   CAS Pulse Width 
  0     2.5 clock cycles 
  1     1.5 clock cycles 

Figure 5.4 CAS Pulse Width Timing 


CAS Pre-charge Time 

Bit 10 of the mode register specifies the CAS pre-charge time, which can be 
programmed to be either 0.5 clock cycles or 1.5 clock cycles. Any combination 
of CAS pulse width and CAS pre-charge time is possible. Figure 
5.5 illustrates the CAS pre-charge timing. 
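
To relate these settings to a DRAM data sheet, it can help to convert them 
to nanoseconds. The Python sketch below does so under a simplifying 
assumption: one page mode cycle is taken to be the CAS pulse width plus the 
CAS pre-charge time, with no allowance for signal skew or buffer delay. The 
function name is hypothetical. 

```python
def page_cycle(cas_pulse_clocks, cas_precharge_clocks, sysclk_mhz):
    """Return (clocks, nanoseconds) for one page mode CAS cycle."""
    if cas_pulse_clocks not in (1.5, 2.5):
        raise ValueError("CAS pulse width is 1.5 or 2.5 clocks")
    if cas_precharge_clocks not in (0.5, 1.5):
        raise ValueError("CAS pre-charge is 0.5 or 1.5 clocks")
    clocks = cas_pulse_clocks + cas_precharge_clocks
    # One clock lasts 1000 / f(MHz) nanoseconds.
    return clocks, clocks * 1000.0 / sysclk_mhz


# Fastest and slowest combinations at 25 MHz (40 ns per clock).
print(page_cycle(1.5, 0.5, 25))
print(page_cycle(2.5, 1.5, 25))
```

Under this model the four combinations span two to four clocks per page mode 
access, which is why these two fields dominate page mode performance. 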


Bit 10 
CP      CAS Pre-charge Width 
  0     0.5 clock cycles 
  1     1.5 clock cycles 

Figure 5.5 CAS Pre-Charge Timing 
(Timing diagram showing the 1/2 Clock and 1.5 Clocks pre-charge settings.) 


Refresh Period 

Bits 11, 12 and 13 of the mode register specify the frequency of the input 
clock to the R3721. The R3721 loads an internal refresh timer with the 
appropriate value to refresh the DRAMs according to the table below. 

The value is appropriate to avoid violating the 10 µs maximum RAS low time 
specification for DRAMs. 
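
The bound implied by the maximum RAS low time can be estimated with simple 
arithmetic. The Python sketch below computes how many whole SysClk cycles fit 
within a given time limit; it is only an upper bound for illustration, not the 
counter value actually loaded by the R3721, and the function name is 
hypothetical. 

```python
def max_clocks_within(limit_ns, sysclk_hz):
    """Return the number of whole clock cycles that fit in `limit_ns`."""
    # Integer arithmetic avoids rounding surprises.
    return (limit_ns * sysclk_hz) // 1_000_000_000


# A 10 microsecond limit at two example clock rates.
print(max_clocks_within(10_000, 25_000_000))
print(max_clocks_within(10_000, 40_000_000))
```

Because the bound scales with frequency, the refresh field is expressed in 
terms of the SysClk rate rather than as a fixed count. 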


Bit 13   Bit 12   Bit 11   SysClk 
RF2      RF1      RF0 


Delayed Chip-select 

Bit 14 of the mode register specifies when the R3721 will sample the 
Chip_Select input at the beginning of any access. The R3721 can be programmed 
to sample the Chip_Select on the first positive edge of the clock following the 
negation of ALE or on the first negative edge of the clock following the negation 
of ALE. 

This bit allows the R3721 to perform optimally in either a high-performance 
(or low frequency) system capable of rapidly decoding addresses, or in systems 
using a slower, or synchronous, approach to address decoding. The R3721 
also needs to be explicitly aware of transfers which do not use its memory 
devices; for example, it can use these cycles to perform a DRAM refresh without 
performance loss in the system. 

The DCS bit also affects the operation of the R3721 for page writes. If the DCS 
bit is cleared, the R3721 can perform page writes in a minimum of two clock cycles. 
If the DCS bit is set, the R3721 can perform page writes in a minimum of three clock 
cycles. Figure 5.6 illustrates the timings of the Chip_Select or the Mode_Select 
input pins. 


Bit 14 
DCS 
  0     CS sampled on the positive edge of the clock 
        2 clock cycle page writes may be possible 
  1     CS sampled on the negative edge of the clock after the negation of ALE 
        2 clock cycle page writes not possible 

Figure 5.6 Chip-select Timing 


DEFAULT SETTINGS 

At power up, the mode register is loaded with default values which 
correspond to the following system: 

• DRAM page size: 512 entries 
• System configuration: Non-interleaved 
• WrNear for fast writes enabled 
• 2 clock cycles delay from RAS assertion to CAS assertion 
• 4 clock cycles for the RAS pulse width and the RAS pre-charge time 
• 2.5 clock cycles for the CAS pulse width 
• 1.5 clock cycles for the CAS pre-charge time 
• 25 MHz frequency of operation 
• Delayed Chip_Select 





Figure 5.7 illustrates the settings of the mode register at power up. 


Figure 5.7 Settings of Mode Register at Power Up 
(Register diagram showing the power-up value of bits 15 through 0, labeled 
D15 through D0.) 


WRITING TO THE MODE REGISTER 

The mode register is a 16-bit write only register that controls the internal 
operation of the R3721 DRAM controller. The different fields of the mode 
register control the behavior of various output control signals such as the RAS 
and the CAS signals. At power up, the mode register is initialized with the 
default settings illustrated in Figure 5.7. To obtain maximum performance out 
of the R3721 DRAM Controller, the mode register needs to be programmed to 
fit the application at hand. 

To access the internal mode register of the R3721, the external address 
decoder must assert both the CS line and the MSel line. The assertion of the 
CS line is important to distinguish among multiple R3721s in a single system. 
The internal mode register of the R3721 should be mapped in the uncacheable 
I/O space of the R3051. 

The R3051 can access the mode register by performing a standard write 
operation to the I/O location occupied by the mode register. The R3721 detects 
the assertion of both the CS and the MSel lines and determines that the access 
is for the internal mode register. The data present on the R3051 data bus 
A/D(15:0) is written into the mode register, regardless of system byte ordering. 
The R3721 returns the ACK signal to the R3051 to terminate the write access 
to the mode register in 3 clock cycles. Thus, the write access to the mode 
register is always 3 clock cycles regardless of the configuration of the external 
memory system. The external state machine controlling the rest of the system 
should not assert the ACK for writes to the mode register, since the R3721 
DRAM Controller asserts ACK with proper timing to terminate the write. Figure 
5.8 illustrates the timing diagrams in writing to the mode register. 
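
Before the write is issued, software has to assemble the 16-bit value. The 
Python sketch below packs raw field codes into a mode register word using the 
bit positions given in this chapter, and never sets the reserved bits 9 and 15. 
The field names and the function are hypothetical, and the meaning of each 
code must still be taken from the corresponding field description; the sketch 
only handles bit placement and range checking. 

```python
# Field name -> (lowest bit, width), following the bit numbers in the text.
FIELDS = {
    "dram_size":     (0, 2),   # bits 1:0
    "interleave":    (2, 1),   # bit 2
    "wrnear":        (3, 1),   # bit 3
    "rcd":           (4, 1),   # bit 4
    "ras_timing":    (5, 3),   # bits 7:5
    "cas_width":     (8, 1),   # bit 8
    "cas_precharge": (10, 1),  # bit 10
    "refresh":       (11, 3),  # bits 13:11
    "dcs":           (14, 1),  # bit 14
}


def pack_mode_register(**codes):
    """Pack raw field codes into a 16-bit mode register value.

    Fields that are not supplied are packed as 0; since every write updates
    all fields, a real program should supply each field explicitly.
    """
    value = 0
    for name, code in codes.items():
        if name not in FIELDS:
            raise KeyError(f"unknown field: {name}")
        shift, width = FIELDS[name]
        if not 0 <= code < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        value |= code << shift
    return value  # bits 9 and 15 are never set


print(hex(pack_mode_register(rcd=1, dcs=1)))
```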

Note that it is recommended that writes to the mode register use '0' in the 
upper A/D bits (A/D(25:16)). This ensures compatibility with possible future 
versions of the DRAM controller. When writing to the mode register, the two 
reserved bits (bit 9 and bit 15) must be written as "0". 

Figure 5.8 Writing to the Mode Register 


AUTO CONFIGURATION DETECTION AND INITIALIZATION 

Many of today's systems are designed to allow for future field upgrades of 
the base memory system to more memory banks and/or deeper DRAM devices. 
Although these upgrade strategies typically do not support moving from non- 
interleaved to interleaved systems, or from "x4" to "x1" devices, the ability to 
offer a base configuration (at a lower selling price) with capability upgrades is 
often a selling feature of the end product. 

To use the R3721 in such systems, the software at boot-up should configure 
the mode register of the R3721 with the maximum memory size it can support 
according to the basic system design. 

The software should then run diagnostics to determine whether or not the 
DRAM size used corresponds to the programmed size. The diagnostic software 
should also determine the presence of multiple banks. Typical strategies for 
such diagnostics include writing distinct values into a given location within 
each bank, and then reading the data back to see if any of the writes did not 
occur properly, or altered data previously written. 
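
One possible form of such a diagnostic is sketched below in Python against a 
simulated memory. Everything specific here is an assumption made for the 
example: the bank size, the test patterns, and the `FakeDram` model in which 
an unpopulated bank ignores writes and reads back as all ones. Real hardware 
may instead alias or float, so a production test would probe more locations. 

```python
BANK_SIZE = 4 * 1024 * 1024  # hypothetical bytes per bank


class FakeDram:
    """Simulated memory: only populated banks store data."""

    def __init__(self, populated_banks):
        self.populated = set(populated_banks)
        self.cells = {}

    def write_word(self, addr, value):
        if addr // BANK_SIZE in self.populated:
            self.cells[addr] = value & 0xFFFFFFFF

    def read_word(self, addr):
        if addr // BANK_SIZE in self.populated:
            return self.cells.get(addr, 0)
        return 0xFFFFFFFF  # assumed behavior of an empty bank


def probe_banks(mem, bank_count=4):
    """Return the banks that read back their own distinct pattern."""
    patterns = {b: 0xA5A50000 | b for b in range(bank_count)}
    # Write from the highest bank down, so that if a missing bank aliased
    # onto a lower one, the lower bank's own pattern is written last.
    for b in reversed(range(bank_count)):
        mem.write_word(b * BANK_SIZE, patterns[b])
    return [b for b in range(bank_count)
            if mem.read_word(b * BANK_SIZE) == patterns[b]]


print(probe_banks(FakeDram([0, 1])))
```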

Once the configuration of the system is determined, the software should 
reprogram the mode register with the exact system configuration to obtain the 
maximum performance out of the R3721 DRAM Controller. 


INTRODUCTION 
This chapter describes the various hardware interfaces of the R3721. 
Included are discussions on: 
• The R3051 processor interface, including the interface to the address 
decoder and the interface to other memory controllers. 
• The interface to DRAM devices. 
• The interface to the DRAM data path transceiver elements. 


R3051 BUS INTERFACE 

The R3721 is designed to reside directly on the R3051 family A/D and control 
busses. To complete the system design, an external address decoder is 
required, and external data path chips such as the IDT73720 Bus Exchangers, 
or IDT74FCT245 bi-directional transceivers should be provided. 

Regardless of size or organization of DRAM, the R3721 is always connected 
to specific bits of the R3051 A/D bus. The R3721 uses programmed values for 
the DRAM size and organization to internally multiplex R3051 address lines 
into the appropriate row and column addresses for the DRAM. Chapter 4 
shows the internal address multiplexing of the R3721. 

The R3721 monitors the processor ALE, Rd, Wr and Burst/WrNear control 
signals to determine the type of cycle in progress. The R3721 contains its own 
address latches, and aligns processor address outputs with DRAM Row and 
Column addresses. 

Figure 6.1 shows the processor interface with the R3721, including the 
interface to the address decoder. 

Figure 6.1 R3721 CPU Interface Connections 
(Block diagram: the R3051 Family Processor connected to the R3721 DRAM 
Controller, with A/D(31:26) feeding the Address Decoder, Addr(3:2), and 
connections From Other Memory Controllers.) 


Processor Interface Signals 

The interface to the processor from the R3721 in general requires no external 
interface logic. This section describes how the R3721 signals are derived from 
the processor interface. 


Reset I 
In most systems, this signal is directly connected with the same logic used 
to drive the R3051 processor Reset signal. 





A/D(25:0) I 

The R3721 A/D(25:0) bus is directly connected to the R3051 family A/D bus. 
Regardless of the actual DRAM configuration, the R3721 is always connected 
the same way to the R3051 bus. Although not all systems require the high- 
order address lines, it is good practice to connect all of A/D(25:0) with the 
R3051. This allows greater flexibility in later upgrading to higher density 
DRAMs, or populating with additional DRAM banks. 


Addr(3:2) I 
As with the A/D(25:0), these inputs are directly connected to the R3051 
Addr(3:2) outputs. 


ALE I 
This input is directly connected with the R3051 ALE output. 


Rd I 
This input is directly connected with the R3051 Rd output. 


Wr I 
This input is directly connected with the R3051 Wr output. 


Burst/WrNear I 
This input is directly connected with the R3051 Burst/WrNear output. 


SysClk I 

This input is directly connected with the R3051 SysClk output. It is not 
connected through a clock buffer, but rather directly connected with the CPU 
output. 


CS I 

The CS input is provided by the system address decoder to select the DRAM 
address space. In general, the address decoder looks at the R3051 output 
address (it may look at the address as captured by external transparent 
latches) to determine which memory resource is currently being accessed. 


MSel I 

The MSel input is provided by the system address decoder to select the 
R3721 mode register. In general, the address decoder looks at the R3051 
output address (it may look at the address as captured by external transparent 
latches) to determine which memory resource is currently being accessed. 


RdCEn O 

The R3721 RdCEn is a tri-stateable output. It is only driven during accesses 
in which the R3721 CS input is asserted. The connection between this output 
and the R3051 RdCEn input depends on the rest of the system. If the rest of 
the system is designed to provide a tri-stateable RdCEn, then this output can 
be wire "OR"ed with the RdCEn outputs of other memory subsystems, and tied 
directly to the RdCEn input of the processor. Otherwise, a logic device must 
perform the logical negative true "OR" function. An internal pull-up is 
provided. 


ACK O 

The R3721 ACK is a tri-stateable output. It is only driven during accesses 
in which the R3721 CS input is asserted. The connection between this output 
and the R3051 ACK input depends on the rest of the system. If the rest of the 
system is designed to provide a tri-stateable ACK, then this output can be wire 
"OR"ed with the ACK outputs of other memory subsystems, and tied directly 
to the ACK input of the processor. Otherwise, a logic device must perform the 
"OR" function. An internal pull-up is provided. 


The Address Map 

In typical MIPS-based systems, such as those using the R3051, RAM is 
located in memory starting at physical address "0". Note that various aspects 
of the kernel implicitly assume that RAM will be available at this location; for 
example, the current exception handler is invoked at a very low physical 
address. Thus, typical systems will decode the DRAM accesses in a region 
beginning at physical address "0", and provide a CS to the R3721. 

The Mode Register of the DRAM controller is typically mapped as an I/O 
peripheral. In MIPS systems (and thus for the R3051), these are referenced 
through kseg1, which is unmapped and uncached. References to kseg1 are 
always translated to the lowest 512MB of the physical address space. The 
system architect typically will use an address well above the maximum amount 
of DRAM expected for the system, but avoid the address space reserved for the 
system EPROM (in typical systems, the EPROM is located in the address region 
accessed by the processor Reset exception vector, which is physical address 
1FC0_0000). 

In systems which use multiple R3721 DRAM Controllers to manage separate 
DRAM subsystems, the system designer will typically arrange the address map 
so that all of the various DRAM subsystems combine to present a single 
contiguous RAM array as seen by software. That is, one DRAM controller may 
be selected to respond to references in the address range 0 to 16MB, and the 
next to respond to the address range 16MB to 32MB. Note that there is no 
particular reason for the various DRAM Controllers' mode registers to appear 
contiguous, and thus these are typically scattered throughout the I/O space 
to simplify address decoding. 
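
The address map considerations above can be modeled in a few lines of Python. 
This is a behavioral sketch of one possible decoder, not a recommended map: 
the DRAM size and the mode register address are hypothetical choices, while 
the kseg1 translation and the EPROM address follow the description above. 

```python
DRAM_SIZE = 16 * 1024 * 1024   # hypothetical: 16 MBytes starting at physical 0
MODE_REG_ADDR = 0x1F00_0000    # hypothetical I/O address of the mode register
EPROM_BASE = 0x1FC0_0000       # Reset exception vector region


def kseg1_to_physical(vaddr):
    """Translate a kseg1 virtual address to its physical address."""
    if not 0xA000_0000 <= vaddr <= 0xBFFF_FFFF:
        raise ValueError("not a kseg1 address")
    return vaddr & 0x1FFF_FFFF  # kseg1 maps onto the lowest 512MB


def decode(paddr):
    """Return (cs, msel) for a physical address; True means asserted."""
    if paddr < DRAM_SIZE:
        return (True, False)          # ordinary DRAM access
    if paddr & ~0x3 == MODE_REG_ADDR:
        return (True, True)           # mode register needs CS and MSel
    return (False, False)             # some other resource, e.g. the EPROM


print(decode(kseg1_to_physical(0xBF00_0000)))
```

A system with several R3721s would repeat the pattern with one DRAM range and 
one mode register address per controller. 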


R3721 DRAM INTERFACE 

The R3721 has been designed to interface to a wide variety of DRAM 
subsystems. Various options include: 

• Interleaved vs. Non-Interleaved 
• Various densities of DRAM 
• Single bank or multiple banks of memory 
• Intelligent Control interface to take advantage of Page Mode DRAMs 
• Various speeds of DRAMs and Processors 
• Various data path options 

Later chapters describe specific memory configurations used with the 
R3051. Note that the R3721 provides enough output signals to interface with 
up to 4 memory banks, and to interface with devices as large as 4M x 4. Many 
systems will use less than the maximum amount of memory supported. It is 
typically good practice to route unused address and control lines to the memory 
array, to allow future or field upgrades to higher density devices or additional 
memory banks. 

Figure 6.2 shows the R3721 DRAM Control interface. In general, the 
following strategies for interconnection apply: 


Figure 6.2 R3721 DRAM Control Interface 
(Block diagram: the R3721 DRAM Controller driving Bank(0) with RAS(0), 
CAS(3:0), DAddr(10:0), OE and WBankEn(0); WBankEn(1) drives the next bank.) 


DAddr(10:0) O 

These outputs are typically connected directly to the DRAM multiplexed 
row/column address inputs. According to the memory system organization 
and the organization of the DRAMs used, the R3721 will align the processor 
addresses with the DRAM addresses as described in Chapter 4. 

These outputs incorporate series resistors to eliminate overshoot and 
undershoot problems associated with large capacitive loads. In addition, high- 
drive capability has been incorporated in these outputs. Thus, the R3721 can 
directly drive large numbers of DRAMs or multiple SIMM modules. 

Certain system configurations, however, require too many DRAMs to drive 
directly from the R3721. Such systems can use external buffer/drivers, and 
select an appropriate system timing model. 


RAS(3:0) O 

These outputs are typically directly connected with the RAS inputs of the 
DRAMs on a bank basis, as described in Chapter 4. The falling edge of this 
signal is used by the DRAMs to capture the row address presented on 
DAddr(10:0). 

In order to directly drive multiple DRAM devices, these signals provide high 
drive, and incorporate series resistors. Each RAS signal may drive multiple 
loads with no system performance degradation. Certain system configurations, 
however, require too many DRAMs to drive directly from the R3721. Such 
systems can use external buffer/drivers, and select an appropriate system 
timing model. 


CAS(3:0) O 

These outputs are directly connected with the CAS inputs of the DRAMs on 
a byte basis, as described in Chapter 4 (CAS(0) corresponds to BE(0), etc.). The 
falling edge of this signal is used by the DRAM to capture the column address 
presented on DAddr(10:0). 

In order to directly drive multiple DRAM devices, these signals provide high 
drive, and incorporate series resistors. However, the propagation delay of CAS 
is a system critical parameter; thus, no CAS signal should drive more than 8 
loads. Certain system configurations require too many DRAMs to drive directly 
from the R3721. Such systems can use external buffer/drivers, and select an 
appropriate system timing model. 


WBank(3:0) O 

These outputs are used to individually control the write enables of various 
memory banks. In non-interleaved systems, all four outputs are asserted, and 
RAS is used to control which bank actually is written. In interleaved systems, 
they are enabled in pairs (writes to an even bank cause WBank(2) and 
WBank(0) to be asserted, etc.). Again, only the particular array being written 
will have its RAS asserted. Thus, these outputs are connected directly to the 
WE inputs of the DRAMs in a given bank (that is, WBank(0) is connected to all 
of the WE inputs of memory array 0, etc.). 
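
The write-enable behavior described above can be sketched in a few lines of Python. This is a minimal model, not the device logic: the function names are hypothetical, banks are numbered 0 to 3, and the odd-bank pairing of WBank(3) with WBank(1) is inferred from the "etc." in the text.

```python
def wbank_asserted(bank, interleaved):
    """Return the set of WBank outputs asserted for a write to `bank` (0..3)."""
    if bank not in range(4):
        raise ValueError("bank must be 0..3")
    if not interleaved:
        # Non-interleaved: all four write enables are asserted.
        return {0, 1, 2, 3}
    # Interleaved: enables are asserted in even or odd pairs.
    # The odd pairing is an assumption inferred from the text.
    return {0, 2} if bank % 2 == 0 else {1, 3}


def banks_written(bank, interleaved):
    """A bank is written only if it sees both its WE and its own RAS asserted."""
    ras_asserted = {bank}  # only the addressed array has RAS asserted
    return wbank_asserted(bank, interleaved) & ras_asserted
```

In both organizations the intersection leaves exactly one bank written, which is why the WBank outputs can be wired straight to the WE inputs.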

During refresh cycles, these outputs are negated. This avoids accessing the 
"test mode" built into modern 4Mb DRAMs. 

In order to directly drive multiple DRAM devices, these signals provide high 
drive, and incorporate series resistors. Certain system configurations, 
however, require too many DRAMs to drive directly from the R3721. Such 
systems can use external buffer/drivers, and select a system timing model 
appropriately. 


OE O 

This output is directly connected to the output enable of common I/O 
DRAMs. It is connected to all DRAMs under the control of the R3721. This 
output also incorporates series resistors for driving large loads. 


DATA PATH CONTROL INTERFACE 

In addition to directly interfacing to the DRAM devices, the R3721 directly 
controls the data path transceivers between the CPU and the DRAMs. The 
R3721 is designed to support the use of 74FCT245 type transceivers in non- 
interleaved configurations, or IDT73720 Bus Exchangers in banked or 
interleaved memory configurations. Figure 6.3 shows the interface between 
the R3721 and 74FCT245 transceivers; Figure 6.4 shows the interface to the 
IDT73720 Bus Exchanger. 

The R3721 directly provides the control signals for the data path, eliminating 
logic (and timing delays) in this path. Typical systems are connected as 
described below: 


DByteEn(3:0) O 

These outputs provide four identical copies of transceiver output enables. 
Note that CAS, which is asserted on a byte basis, controls which DRAMs 
actually participate in the transfer. To equally balance the loads, these outputs 
are typically connected on a byte basis. 

If the data path interface uses 74FCT245s, then each DByteEn output is directly 
connected to the "OE" input of the transceiver on that byte lane. If the data path 
uses IDT73720 Bus Exchangers, DByteEn(1:0) are connected to the Bus 
Exchanger on the lower half of the data bus (Data(15:0)), and DByteEn(3:2) are 
connected to the Bus Exchanger on the upper half of the data bus (Data(31:16)). 


T/R O 

This signal indicates the direction of the data path, and is connected directly 
to the T/R input of the 74FCT245 or IDT73720. This output is high during 
writes, and low during reads. 

Figure 6.3 R3721 Data Path Interface to 74FCT245s 


Path O 

This signal is directly connected to the Path input of the IDT73720. It is used 
to specify which memory array is participating in the current transfer. The 
R3721 outputs a high to enable an even bank, and a low for an odd bank. 


YZLEn O 

This signal is connected to the YLEn and the ZLEn inputs of the IDT73720 
Bus Exchanger. It is used to capture the data provided by both banks of 
memory of an interleaved system, for later sequencing onto the processor A/D 
bus. The latch is open when this output is high. 

Figure 6.4 R3721 Data Path Interface to IDT73720 Bus Exchangers 


SUMMARY 

The R3721 has been designed to eliminate virtually all glue logic when 
interfacing an R3051 CPU with DRAM memory devices. However, the R3721 
allows the system designer maximum flexibility, by supporting a wide variety 
of memory systems, and by allowing the system architect to construct the 
address map appropriate to the target application. 


THE USE OF THE R3721 
IN A NON-INTERLEAVED 
MEMORY SYSTEM 





INTRODUCTION 

This chapter describes how to use the R3721 DRAM controller in a 
non-interleaved memory system. It explains in detail the effect of various 
configurations of the mode register on the timings of the output signals. The 
design considerations discussed include: 

• A detailed description of the design of a non-interleaved DRAM system. 
• A detailed description of the timings for single read transactions. 
• A detailed description of the timings for write transactions. 
• A detailed description of the timings for quad word read transactions. 


NON-INTERLEAVED SYSTEM DESIGN 

A non-interleaved memory system consists of 1 to 4 banks of “x1” or “x4” 
DRAMs interfaced to the CPU by the DRAM controller. The R3721 uses the 
DRAM size information encoded in the mode register and selects the appropriate 
bank by decoding high-order address bits from the CPU. In the non-interleaved 
configuration, each RAS controls one bank while all the CAS signals are shared 
among the four banks (i.e. RAS0 and CAS(3:0) control bank 0 and RAS3 and 
CAS(3:0) control bank 3). The CAS signals should be distributed on a byte-lane 
basis; that is, all DRAMs on the byte lane corresponding to BE(0) should use 
CAS(0), etc. In the non-interleaved configuration, the R3721 supports either 
standard 74245 data transceivers or IDT73720 Bus Exchangers in the data 
path. In non-interleaved systems, the output data path control signals from 
the R3721 work identically for either type of data path device. 

For the ease of discussion, all the timing diagrams illustrated in this chapter 
assume the settings of the mode register as shown in Figure 7.1. These settings 
correspond to a 25MHz non-interleaved system with the following: 

• 256Kx4 DRAMs, 
• WrNear enabled, 
• RAS pulse width is 3 clock cycles, 
• RAS pre-charge time is 2 clock cycles, 
• CAS pulse width is 1.5 clock cycles, 
• CAS pre-charge time is 0.5 clock cycle, 
• 2 clock cycles delay from RAS to CAS, 
• Fast chip-select mode. 
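
To relate these settings to DRAM data sheet parameters, it can help to convert them from clock cycles to nanoseconds. The Python sketch below does this for the 25MHz example; the names are hypothetical, and the results are nominal programmed values that ignore output delays, skew and loading, so they are a starting point for comparison rather than guaranteed timings.

```python
from fractions import Fraction

SYSCLK_MHZ = 25
CLOCK_NS = Fraction(1000, SYSCLK_MHZ)  # one SysClk period: 40 ns at 25MHz

# Example settings from the list above, expressed in SysClk cycles.
SETTINGS_CYCLES = {
    "RAS pulse width": Fraction(3),
    "RAS pre-charge": Fraction(2),
    "CAS pulse width": Fraction(3, 2),
    "CAS pre-charge": Fraction(1, 2),
    "RAS to CAS delay": Fraction(2),
}


def to_ns(cycles):
    """Convert a duration in SysClk cycles to nanoseconds."""
    return float(cycles * CLOCK_NS)


NOMINAL_NS = {name: to_ns(cycles) for name, cycles in SETTINGS_CYCLES.items()}
```

For instance, the 1.5 cycle CAS pulse corresponds to a nominal 60 ns at this clock rate.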

Figure 7.1 Settings of the Mode Register Used as an Example in this Chapter 

The timing diagrams illustrated in this chapter apply for single bank as well 
as for multiple bank non-interleaved systems. The Path signal is illustrated 
with two values, and the clock edge at which its value should change is 
indicated to accommodate multiple banks in systems using the Bus Exchanger 
in the data path. For even banks, the Path signal is always 1; for odd banks, 
the Path signal is 0. RASn is any one of the four RAS signals. 


SINGLE READ TRANSACTION TIMINGS 

In general, there are only two types of read transactions from the R3051: 
quad word reads and single word reads. Quad word reads occur only in 
response to cache misses. All instruction cache misses are processed as quad 
word reads; data cache misses may be processed as quad word reads or single 
word reads, depending on the initialization of the processor. Uncached 
references are always processed as single datum reads. This section describes 
the timing diagrams involved in single datum reads; a later section describes 
quad word read operations. The R3721 only asserts the CAS signals 
corresponding to the byte lanes requested for the particular transfer. 


Start of Single Read Access 

The R3721 determines the beginning of a single read access by monitoring 
the assertion of the ALE and the Rd signals from the R3051. The R3721 latches 
and multiplexes the input address from the R3051 and outputs the row 
address on the DAddr bus. If the fast chip-select mode is selected (DCS bit in 
mode register = 0), the CS input must be valid before the following rising edge 
of SysClk for the R3721 to respond to the access; otherwise the R3721 assumes 
the access to be outside of the memory space it controls and does not assert 
any control signals. For the slow chip-select mode, the CS input must be valid by 
the following falling edge of SysClk. Figure 7.2 illustrates the beginning of a 
single read access for both cases. 

Figure 7.2 (a) Start of Single Read Access for Fast Chip-select 

Figure 7.2 (b) Start of Single Read Access for Slow Chip-select 


Memory Control Signals for Single Read Accesses 

After the detection of the CS signal, the R3721 starts to issue the various 
control signals to the DRAMs in the following way: 

• On the rising edge of SysClk following CS, the appropriate RASn signal is 
  issued (RAS0 for access to bank 0, RAS1 for access to bank 1, ...). The ACK 
  and RdCEn signals are enabled and driven to a "high" level. 

• Depending on the value of the RCD bit in the mode register, the R3721 can 
  proceed in two different ways: 

If RCD=0, the column address is presented on the DAddr bus on the falling 
edge of SysClk following the assertion of the RAS signal. The appropriate CAS 
signals are asserted on the next rising edge of SysClk (CAS0 for BE0, CAS1 for 
BE1, ...). 

If RCD=1, the column address is presented on the DAddr bus on the falling 
edge of SysClk, 1.5 clock cycles following the assertion of the RAS signal. The 
CAS signals are asserted on the following rising edge of SysClk. 

Figures 7.3 (a, b) illustrate the timing diagrams in issuing the control 
signals to the DRAMs for both values of the RCD bit. 
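
The two RCD cases reduce to simple half-clock arithmetic, sketched below in Python. The function name is hypothetical, and times are measured in SysClk cycles from the rising edge on which RAS is asserted; this is one reading of the description above, not a statement of guaranteed device timing.

```python
from fractions import Fraction


def ras_to_cas_schedule(rcd):
    """Return (column_address_time, cas_assert_time) in clocks after RAS asserts."""
    if rcd not in (0, 1):
        raise ValueError("RCD is a single mode register bit")
    # Column address appears on a falling edge: 0.5 clocks after RAS for
    # RCD=0, or 1.5 clocks after RAS for RCD=1.
    column_time = Fraction(1, 2) + rcd
    # CAS is asserted on the next rising edge, half a clock later.
    cas_time = column_time + Fraction(1, 2)
    return column_time, cas_time
```

With RCD=1 the sketch gives a RAS to CAS delay of 2 clocks, which matches the example settings used throughout this chapter.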

Figure 7.3 (a) DRAM Control for RCD=0 (Single Read) 

Figure 7.3 (b) DRAM Control for RCD=1 (Single Read) 


End of a Single Read Access 

Depending on the setting of the CAS pulse width in the mode register, the 
CAS signals are kept asserted for either 1.5 or 2.5 clock cycles, and negated by 
the falling edge of SysClk. The R3721 is designed in such a way that the R3051 
samples the data on the same falling edge used to negate the CAS signals. Thus, 
the R3721 asserts ACK and the RdCEn one clock cycle before negating the CAS 
signals, so that they are sampled by the processor one-half cycle before the data 
sample point. The ACK and RdCEn signals are asserted on the falling edge of 
SysClk and kept asserted for one clock cycle. 

To take advantage of the page mode capabilities of the DRAMs, the R3721 
always assumes that any single read access will be followed by another access 
(read or write or quad word read) within the same DRAM page. Based on this 
assumption, the RAS signal remains asserted at the end of the single word read 
access to continue in the page mode of the DRAMs. Figures 7.4 (a, b) illustrate 
the timing diagrams in ending a single read access for both values of the CAS 
pulse width. 

Figure 7.4 (a) End of Single Read Access, CAS Pulse = 1.5 Clock Cycle 

Figure 7.4 (b) End of Single Read Access, CAS Pulse = 2.5 Clock Cycle 

Figure 7.5 illustrates the complete control timings involved in a single read 
access for the settings of the mode register illustrated in Figure 7.1. 

Figure 7.5 Example of a Single Read Access 


Page Read Accesses 

In order to reduce latency of memory operations, the R3721 attempts to use 
page mode transfers wherever possible. To support this operation, the R3721 
incorporates an internal address comparator which compares high-order bits 
from the current transfer with high-order bits from the previous transfer (the 
previous transfer high-order bits are the current row address of the DRAMs). 
The R3721 determines the maximum page size of the memory system based on 
the DRAM size information encoded in the mode register. 
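
The comparison can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the function name is hypothetical, addresses are byte addresses on a 32-bit data path, and the default of 9 column address bits assumes a 256Kx4 device with an even row/column split. The address alignment the R3721 actually uses is described in Chapter 4.

```python
def same_page(prev_addr, addr, column_bits=9, byte_offset_bits=2):
    """Return True when both byte addresses share the same high-order bits.

    column_bits and byte_offset_bits are assumptions for this example; the
    bits above them stand in for the row address and bank select.
    """
    shift = column_bits + byte_offset_bits
    return (prev_addr >> shift) == (addr >> shift)
```

Under these assumptions a page spans 2KB, so an access at 0x07FC followed by one at 0x0800 would leave the page.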

Page read accesses take advantage of the previous transfer in that the RAS 
signal is already asserted and the DRAM already has accessed the appropriate 
row address. Page read accesses have timings similar to single read accesses, 
with the exception that no time is lost in re-asserting the RAS signal and 
re-multiplexing the row and column addresses. 

Once the R3721 detects the start of a single read access from the R3051 and 
determines that it is within the current page, it outputs the column address to 
the DRAMs. On the following rising edge of SysClk, the CAS signals are asserted 
in the fast chip-select mode. In the slow chip-select mode, the CAS signals are 
asserted on the second rising edge of SysClk. The page read access is then 
terminated as for a single read access. Figure 7.6 illustrates the timing for a 
page read access for the settings of the mode register illustrated in Figure 7.1. 

Figure 7.6 Page Read Access Timing Diagram 


Single Read Access Outside of Page 

It may occur that the R3721 has left the DRAMs in page mode, but the 
subsequent access is outside of the current DRAM page. In this case, the 
transfer must be completed as a single read transaction. However, RAS must 
be pre-charged before the transfer begins. 

Once the R3721 detects the start of a single read access from the R3051 and 
determines that it is not within the same page, it outputs the row address to 
the DRAMs. The RAS signal is negated on the second rising edge of SysClk. The 
RAS signal is kept high for the time specified in the mode register (a minimum 
of 2 clock cycles). 

The access then continues as for a single read access: RAS is asserted, the 
column address is presented, the CAS signals are asserted, and the response 
is generated to the processor. The read access outside of page is then terminated 
as for a single read access. Figure 7.7 illustrates the timing diagrams for a read 
access outside of page for the settings of the mode register as illustrated in 
Figure 7.1. 

Figure 7.7 Single Read Access Outside of Page 


SINGLE WRITE TRANSACTION TIMINGS 

In the R3051 family, a significant percentage of the bus traffic is due to 
processor writes to memory. Unlike processor load instructions and instruction 
fetches, which are usually satisfied by the on-chip processor caches and thus 
not seen on the bus, all processor store instructions are seen at the bus 
interface as single write transactions. Note that there is no such thing as a 
“quad word” write; the R3051 performs a word or a subword write as a single 
autonomous bus transaction. However, the R3051 provides a WrNear signal to 
indicate that the present write has the same upper 22 address bits as the 
preceding write, and is used to optimally retire strings of write operations on 
the bus interface. 





Start of Write Access 

The R3721 determines the beginning of a single write access by monitoring 
the assertion of the ALE and the Wr signals from the R3051. The starting 
sequence for a single write access is very similar to the starting sequence of a 
single read access, which is illustrated in Figures 7.2 (a, b). The WBank(3:0) 
signals are asserted on the falling edge of SysClk after the detection of the Wr 
signal. In non-interleaved systems, all four WBank signals are identical copies; 
they are typically distributed one per bank to evenly reduce loading. 

Figure 7.8 illustrates the starting sequence for a write access for the fast 
chip-select case. 


Memory Control Signals for Single Write Accesses 

The memory control signal sequence for single write accesses is very similar 
to the single read access sequence described earlier. Specifically, the timing 
of the RAS and CAS control signals and of the row and column addresses is 
identical for reads and writes. 

One difference between reads and writes arises in the case of partial word 
access. Partial writes must be handled specifically so that unaffected bytes 
within the word are not inadvertently written. The R3721 uses the CAS(3:0) 
bus to provide individual byte enables to the DRAMs. These signals are derived 
from the BE(3:0) outputs from the processor. During partial writes, only those 
bytes enabled by the processor have their corresponding DRAM enabled for 
writes, since only those DRAMs see CAS asserted. 
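
The effect of gating CAS with the byte enables can be modeled as a byte-lane merge, as in the Python sketch below. The function name is hypothetical; byte enables are treated as a logical mask (bit n set when BE(n) is active, regardless of the electrical polarity of the pins), and lane n is assumed to carry data bits 8n+7 down to 8n.

```python
def partial_write(old_word, new_word, byte_enables):
    """Return the 32-bit DRAM word after a write with the given BE mask.

    Only lanes whose CAS is asserted are written; CAS(n) follows BE(n).
    The lane-to-data-bit mapping is an assumption of this example.
    """
    result = old_word
    for lane in range(4):
        if byte_enables & (1 << lane):
            mask = 0xFF << (8 * lane)
            result = (result & ~mask) | (new_word & mask)
    return result & 0xFFFFFFFF
```

A byte store to lane 0 therefore changes only the low byte of the stored word, and the other three DRAM lanes keep their previous contents.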

A final consideration in write activity is the availability of the data to the 
DRAMs. To eliminate the penalty typically associated with multiplexed busses, 
the R3051 drives valid data out one half cycle into the transfer. Thus, the write 
data is available early in the transfer, and the R3721 does not need to wait for 
the processor data. 


End of a Single Write Access 

Terminating a write access is different from terminating a read access, based 
on the bus interface of the CPU. On reads, the R3051 samples data one-half 
clock after it samples the control input asserted; on writes, the R3051 holds 
the write data one full cycle after it samples ACK. Thus, the DRAM controller 
can assert ACK early in the transfer, and be assured that it has a full cycle of 
valid data remaining. 

In a write access, the R3721 asserts the ACK signal half a clock cycle before 
the assertion of the CAS signals. This means that the R3051 terminates the 
write access during the cycle in which the CAS signals are asserted. This 
shortens the initial write latency by one clock cycle compared to the initial read 
latency. In this scheme, the data from the R3051 is guaranteed to be valid for 
one clock cycle after the assertion of CAS since the R3051 doesn’t negate the 
bus until one clock cycle after the detection of the ACK signal. This clock cycle 
is longer than the required data hold time for most DRAMs. The R3721 also 
guarantees that the WBank(3:0) and the DByteEn(3:0) signals are valid and do 
not change for one clock cycle after the assertion of CAS. 

Depending on the encoding of the CAS pulse width in the mode register, the 
CAS signals are kept asserted for 1.5 or 2.5 clock cycles and negated on the 
falling edge of SysClk. This means that CAS will be kept asserted for 0.5 to 1.5 
clock cycle into the next access. For systems which choose a CAS low time of 
2.5 cycles, there will be no penalty to a subsequent page read or write access, 
even though the CAS signals are kept asserted into the next access. 

Similar to a single read access, the R3721 assumes that any write access 
will be followed by another access (read, write or quad word read) within the 
same DRAM page. Based on this assumption, the RAS signal is kept asserted 
at the end of the single word write access to enable the page mode of the DRAMs. 
Figures 7.9 (a, b) illustrate the timing in ending a single write access for both 
values of the CAS pulse width. 

Figure 7.9 (b) End of Single Write Access, CAS Pulse = 2.5 Clock Cycle 


Figure 7.10 illustrates the complete control timings involved in a single write 
access for the settings of the mode register as illustrated in Figure 7.1. By 
comparing Figure 7.10 with Figure 7.5, it can be noted that for the same 
system, the initial write latency is one clock cycle shorter than the initial read 
latency. 

Figure 7.10 Timing Diagrams for a Single Write Access 


Page Write Accesses 

The R3051 provides a WrNear signal to indicate that the present write has 
the same upper 22 address bits as the preceding write, a page size compatible 
with virtually any DRAM. The R3721 has an internal page comparator that 
determines the true page size based on the DRAM size encoded in the mode 
register, thus optimizing for the particular memory system. Based on the 
internal page comparator, the R3721 can retire a page mode write in a 
minimum of 3 clock cycles, the same as for a page read access. However, the 
R3721 also uses the WrNear signal from the R3051; when WrNear is asserted, 
the R3721 can bypass its internal comparator and also the CS input detection, 
and retire the write access in the optimal time of 2 clock cycles. 

The page write access takes advantage of the previous cycle in that the RAS 
signal is already asserted. Page write accesses have similar timing to single 
write accesses, with the exception that no time is lost in re-asserting the RAS 
signal and re-multiplexing the row and column addresses. 

Once the R3721 detects the start of a single write access from the R3051 and 
determines that it is within the current DRAM page, it outputs the column 
address to the DRAMs. On the following rising edge of SysClk, the CAS signals 
are asserted in the fast chip-select mode. In the slow chip-select mode, the CAS 
signals are asserted on the second rising edge of SysClk. The page write access 
is then terminated as for a single write access. 

The R3721 uses very specific rules to determine whether or not to bypass its 
internal page comparator by using the WrNear signal from the R3051. All of the 
following conditions must be satisfied to achieve two cycle writes: 

• settings in the mode register must be as follows: 
  - fast chip select is enabled (DCS='0'), 
  - CAS pre-charge = 0.5 clock cycle (CP='0'), 
  - CAS pulse width = 1.5 clock cycle (C(1:0)='01'), 
  - WrNear is enabled (WrNr='0'). 

• the previous access was a write access to the memory space controlled by 
  the R3721 (CS was asserted during last transfer). 

If both conditions are satisfied, the R3721 ignores the CS input line and 
relies instead on the WrNear signal. If, at the detection of the write access, the 
WrNear signal is not asserted, the R3721 defaults to its standard mode of 
operation and retires a write in a minimum of 3 clock cycles (page mode) or 
longer. If the WrNear signal is asserted along with the Wr signal, the R3721 
asserts the ACK signal on the falling edge of SysClk and retires the write in two 
clock cycles. This timing is illustrated in Figure 7.11 (a). 
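
The rules above can be summarized as a single predicate. The Python sketch below is a hedged restatement of the text for an in-page write, not the device's internal logic: the class and function names are hypothetical, and only the field encodings quoted above (DCS, CP, C(1:0), WrNr) are taken from the manual.

```python
from dataclasses import dataclass


@dataclass
class ModeRegister:
    dcs: int   # 0 = fast chip select
    cp: int    # 0 = CAS pre-charge of 0.5 clock cycle
    c: int     # 0b01 = CAS pulse width of 1.5 clock cycles
    wrnr: int  # 0 = WrNear enabled


def min_page_write_clocks(mode, prev_was_write_to_this_dram, wrnear_asserted):
    """Return the minimum clocks to retire an in-page write (2 or 3)."""
    bypass_allowed = (
        mode.dcs == 0
        and mode.cp == 0
        and mode.c == 0b01
        and mode.wrnr == 0
        and prev_was_write_to_this_dram
    )
    # Without the bypass, the internal page comparator path applies.
    return 2 if (bypass_allowed and wrnear_asserted) else 3
```

Any single failed condition, such as slow chip select or a 1.5 cycle CAS pre-charge, drops the write back to the 3 cycle page mode path.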

Figure 7.11 (b) illustrates the timing diagrams for a page write where the CAS 
pulse width is set for 1.5 clock cycles with slow chip-select and CAS 
pre-charge = 0.5 clock cycles. In this case again, the internal page comparator is 
not bypassed and the write access is retired in 3 clock cycles. 

Figure 7.11 (b) 3 Clock Cycles Page Write with Slow Chip-Select, 

Figure 7.11 (c) illustrates the timing diagrams for a page write where the CAS 
pulse width is set for 2.5 clock cycles with slow chip-select and CAS 
pre-charge = 0.5 clock cycles. In this case again, the internal page comparator is 
not bypassed since the system is not set up for fast chip select, and the write 
access is retired in 3 clock cycles. 

Note that when the CAS pulse width is programmed as 2.5 cycles, slow CS 
must be used. Otherwise, the DRAM write enable will be asserted too soon for 
the DRAMs, resulting in a spurious write cycle. This rule applies regardless of 
the CAS pre-charge width selected. 

Figure 7.11 (c) 3 Clock Cycles Page Write with Slow Chip-Select, 

Figure 7.11 (d) illustrates the timing diagrams for a page write where the CAS 
pulse width is set for 1.5 clock cycles and CAS pre-charge = 1.5 clock cycles. 
When CAS pre-charge time is 1.5 clock cycles, back-to-back two cycle near 
writes are not possible, regardless of the setting of the DCS bit. In this case 
again, the internal page comparator is not bypassed and the write access is 
retired in 3 clock cycles. 

Figure 7.11 (d) 3 Clock Cycles Page Write with CAS Pulse Width = 1.5 Clock Cycles, CAS 
Pre-charge = 1.5 Clock Cycles. 

Figure 7.11 (e) illustrates the timing diagrams for a page write where the 
WrNear signal is not issued and the R3721 relies on its internal page 
comparator to determine the page size. In this case again, the write access is 
retired in a minimum of 3 clock cycles. This timing will also be exhibited in 
systems which disable the WrNear feature of the processor, via the mode 
register. 

Figure 7.11 (e) Page Write Using Internal Comparator, WrNear Not Asserted 

Figure 7.12 illustrates the timing diagrams for a page write access for the 
settings of the mode register as illustrated in Figure 7.1. 

The concept of page write and page read applies for all cases: single reads 
followed by single writes or vice versa. Figure 7.13 illustrates the timing for a 
single read access followed by a single write access followed by a single read 
access, all within the same page and based on the settings of the mode register 
illustrated in Figure 7.1. 

Figure 7.13 Single Read Followed by a Single Write Followed by a Single Read Access 


Single Write Access Outside of Page 

Single write accesses outside of page are single write accesses from the 
R3051 that happen to fall outside the DRAM page accessed by the previous 
single read or single write access. The write access outside of page can’t take 
advantage of the previous cycle, and thus is processed as a standard write. 
However, RAS must be pre-charged prior to the write being processed. The 
single write access outside of page has a very similar timing to the single write 
access with the exception that RAS is pre-charged before re-multiplexing the 
row and column addresses. 

Once the R3721 detects the start of a single write access from the R3051 and 
determines that it is not within the same page, it begins pre-charging RAS. The 
RAS signal is negated on the second rising edge of SysClk. The RAS signal is 
kept high for the time specified in the mode register, which is a minimum of 2 
clock cycles. The access then continues as for a single write access: RAS is 
asserted, the column address is presented, the CAS signals are asserted, and 
the access is terminated. Figure 7.14 illustrates the timing 
diagrams for a write access outside of page for the settings of the mode register 
illustrated in Figure 7.1. 

Figure 7.14 Single Write Access Outside of Page 


Partial Word Write Operation 

Partial word write accesses are standard write accesses from the R3051 with 
the exception that only selected bytes within a word are enabled. This 
information is provided by the BE(3:0) signals from the R3051. The R3721 
maps the BE(3:0) signals from the R3051 directly into the CAS(3:0) signals. For partial 
word accesses then, only the CAS signals of the selected bytes will be asserted. 


QUAD WORD READ TRANSACTION TIMINGS 

Quad word read operations are reads to the memory system in which the 
R3051 reads 4 contiguous words from memory, always starting on an even 
word boundary, and never crossing a DRAM page boundary. Quad word reads 
occur only in response to cache misses. All instruction cache misses are 
processed as quad word reads; data cache misses may be processed as quad 
word reads or single word reads. 


Start of Quad Word Read Access 

The start of a quad word read access is very similar to the start of a single 
read access and is described earlier. The only exception is that the Burst signal 
from the R3051 is asserted at the same time as the Rd signal. 





Memory Control Signals During Quad Word Read Accesses 

The memory control signal sequence for a quad word read access is very 
similar to a single read access in the following way: 

• The first word read from memory is treated exactly as a single read access 
  and the timing has been specified earlier. 

• To read the remaining 3 words from memory, the RAS signal is kept asserted 
  while the CAS signals are toggled three extra times. After the first word is 
  read, the CAS signals are de-asserted on the falling edge of SysClk (the 
  same edge at which the R3051 samples the first word). The CAS signals 
  are then asserted on the rising edge of SysClk after satisfying the CAS 
  pre-charge requirements encoded in the mode register. The CAS signals are 
  kept asserted for the time specified by the CAS pulse width in the mode 
  register. They are then de-asserted by the falling edge of SysClk. This edge 
  again corresponds to the edge where the R3051 samples the next word. 
  This process is repeated for the remaining three words. 

• To enable the read buffer of the R3051, for every word available from the 
  memory system the RdCEn is asserted for one clock cycle. 

Figure 7.16 illustrates the control signals involved in the quad word read 
transactions. 


End of a Quad Word Read Access 

To terminate a quad word read access, the memory system must return the 
ACK signal back to the R3051. To take advantage of the R3051 instruction 
streaming and to ensure optimal performance, the ACK signal must be asserted 
four clock cycles before the fourth word is sampled by the R3051. The R3721 
makes internal calculations based on the settings of the mode register and 
always asserts the ACK signal four clock cycles before the fourth word is ready. 
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Figure 7.15 (a) Quad Word Read Transaction Timing, CAS Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle 


At the end of a quad word read, the R3721 always negates the RAS signal and
exits the page mode of the DRAM. This feature has been incorporated because
simulations have shown that a write outside of a page is the most likely transfer
after a quad word read access. By doing this, the time lost to pre-charge the
RAS signal is minimized for the next transaction. The RAS signal is always
negated half a clock cycle after the negation of the CAS signals in any mode or
configuration. Figure 7.15 (a) illustrates the timing of the control signals
involved in a quad word read transaction for a CAS pulse width of 1.5 clock
cycles and a CAS pre-charge time of 0.5 clock cycles.
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Figure 7.15 (b) illustrates the timing of the control signals involved in a quad 
word read transaction for a CAS pulse width of 1.5 clock cycles and a CAS
pre-charge time of 1.5 clock cycles.
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Figure 7.15 (b) Quad Word Read Transaction Timing, CAS Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycle 
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In quad word read transactions, the rate at which the CAS signals are toggled 
determines the speed at which the memory system will return the remaining
words to the R3051. Figure 7.16 illustrates the complete control timings 
involved in a quad word read access for the settings of the mode register 
illustrated in Figure 7.1. 
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Figure 7.16 Quad Word Read Access Timing Diagrams 
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Page Quad Word Read Accesses 

Page quad word read accesses are quad word read accesses from the R3051
that happen to fall within the same DRAM page as the previous single read or
single write access.

The page quad word read access takes advantage of the previous cycle in that 
the RAS signal is already asserted for the target row address. The page quad 
word read access has similar timing to the quad word read access, with the 
exception that no time is lost in re-asserting the RAS signal and re-multiplexing
the row and column addresses. 

Once the R3721 detects the start of a quad word read access from the R3051
and determines that it is within the same page, it outputs the column address 
to the DRAMs. On the following rising edge of SysClk, the CAS signals are 
asserted in the fast chip-select mode. In the slow chip-select mode, the CAS 
signals are asserted on the second rising edge of SysClk. The access proceeds 
then as for a standard quad word read access. Figure 7.17 illustrates the timing 
diagrams for a page quad word read access for the settings of the mode register 
illustrated in Figure 7.1. 
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Figure 7.17 Page Quad Word Read Access Timing Diagrams 
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INTRODUCTION 

This chapter is a detailed example on how to use the R3721 to interface a two 
bank, non-interleaved DRAM memory system to the R3051 RISController 
Family. It will describe the general system implementation and the connections 
between the R3721 and the rest of the system. It will also give a detailed 
explanation of how to set the mode register to adapt the R3721 to the 
application target. Finally, this chapter will summarize some of the timing 
diagrams involved in different types of accesses for the system presented in this 
chapter. 


GENERAL SYSTEM DESCRIPTION 

In a typical system, the R3051 uses a 2x input clock for its internal 
operation and produces a 1x output clock SysClk for use by the external 
system. Figure 8.1 illustrates a general purpose system based on the R3051. 
The system shown is a synchronous one where the memory controllers use the 
SysClk to synchronize their operation to the R3051. The R3721 DRAM 
controller controls two 32-bit banks of non-interleaved DRAMs along with the 
data buffers (74FCT245 or the Bus Exchanger) that go with them. The rest of
the system (EPROMs and I/O) is controlled by a separate external state
machine implemented in a couple of programmable logic devices, and is beyond
the scope of this manual.
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Figure 8.1 General System Using the R3051 and the R3721. 






The R3721 DRAM Controller uses SysClk directly from the R3051, while the
other memory subsystems may use a buffered version of SysClk to reduce the
loading effect on the clock line. The R3721 connects directly to the multiplexed
address/data bus of the R3051. The R3721 uses the ALE signal to latch the
address of the current access. During writes to the internal mode register of the
R3721, data is presented on the lower two bytes (A/D(15:0), regardless of
endianness) of the multiplexed address/data bus and latched by the R3721
into its internal mode register. For the rest of the external system, standard
latches such as IDT 74FCT373s demultiplex the R3051 address and data
buses. The R3721 shares the control signals from/to the R3051 with the rest
of the external system.

An address decoder PAL connects directly to the outputs of the address
latches and provides the system with the required chip-select lines. The
address decoder thus provides the R3721 DRAM Controller with the required
CS and MSel enable lines. In this example, the R3721 controls two
non-interleaved banks of 256K x 4 DRAMs that reside between addresses
0x0000_0000 and 0x001F_FFFF. The internal mode register of the R3721 resides
in the I/O space, at address 0x0020_0000. The address decoder PAL must generate the
DRAM_CS line for any access to the DRAM memory space and must issue both
the DRAM_CS and the MSel lines for a write to the mode register. Figure 8.2
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illustrates the address decoder PAL equations to produce the DRAM_CS and 
the MSel lines. 


DRAM_CS NOT = !LA31 AND !LA30 AND !LA29 AND !LA28 AND !LA22 AND
              !RD      {issue for reads}

           OR !LA31 AND !LA30 AND !LA29 AND !LA28 AND !LA22 AND
              !WR      {issue for writes}

           OR !LA31 AND !LA30 AND !LA29 AND !LA28 AND LA22 AND
              !WR;     {issue for access to MSel}


MSEL NOT    = !LA31 AND !LA30 AND !LA29 AND !LA28 AND LA22 AND
              !WR;     {issue for access to MSel}


LAxx is the latched address from the 74FCT373 address latches.


Figure 8.2 Address Decoder PAL Equations for DRAM_CS and MSel. 
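The decoder equations of Figure 8.2 can be modelled in software to check a proposed address map before committing it to a PAL. The following is a minimal sketch, assuming that "NOT" in the figure marks active-low outputs, that RD and WR are active low, and that LA22 denotes address bit 22 exactly as printed. The function name `decode` and the constant `MSEL_BIT` are illustrative and do not come from the data sheet; the select bit should be checked against the actual system address map.

```python
# Sketch of the Figure 8.2 decoder. All control signals are active low:
# 0 means asserted, 1 means negated (an assumption about "NOT" in the figure).
MSEL_BIT = 22  # address bit that the printed equations call LA22


def decode(addr, rd_n, wr_n):
    """Return (dram_cs_n, msel_n) for a latched 32-bit address."""
    def la(bit):
        return (addr >> bit) & 1

    upper_clear = not (la(31) or la(30) or la(29) or la(28))
    mode_space = bool(la(MSEL_BIT))
    reading = rd_n == 0
    writing = wr_n == 0

    # DRAM space: reads and writes both select the controller.
    dram_access = upper_clear and not mode_space and (reading or writing)
    # Mode register: only writes assert both DRAM_CS and MSel.
    mode_write = upper_clear and mode_space and writing

    dram_cs_n = 0 if (dram_access or mode_write) else 1
    msel_n = 0 if mode_write else 1
    return dram_cs_n, msel_n


if __name__ == "__main__":
    print(decode(0x0000_1000, rd_n=0, wr_n=1))    # DRAM read
    print(decode(1 << MSEL_BIT, rd_n=1, wr_n=0))  # mode register write
```

A read in the mode register space leaves both outputs negated, since the equations only issue MSel for writes.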


DETAILED DESCRIPTION OF THE R3721 CONNECTIONS 

In this example, the R3721 controls two banks of non-interleaved 256K x 4 
DRAMs to obtain a maximum DRAM memory space of 2 MBytes. Each memory 
bank consists of 8 devices to interface to the R3051 32-bit data bus. Four 
standard data transceivers 74FCT245 in the data path isolate the DRAM banks 
from the R3051 multiplexed address/data bus. This will reduce the loading 
effect on the bus and prevent any contentions from occurring. Figure 8.3 
illustrates the detailed connections among the various modules. 
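The device count and memory size quoted above follow directly from the DRAM organization. The sketch below reproduces that arithmetic, assuming a 32-bit data bus; the function name and its parameters are illustrative only.

```python
# Sketch: derive devices per bank, total bytes and address lines from the
# DRAM organization. Names are illustrative, not taken from the data sheet.
def dram_array_geometry(depth, width_bits, banks, bus_bits=32):
    """depth = addressable locations per device, width_bits = bits per location."""
    devices_per_bank = bus_bits // width_bits
    total_bytes = banks * depth * (bus_bits // 8)
    # Number of byte-address lines needed to span the whole array.
    address_lines = (total_bytes - 1).bit_length()
    return devices_per_bank, total_bytes, address_lines


if __name__ == "__main__":
    # Two banks of 256K x 4 devices, as in this chapter.
    print(dram_array_geometry(256 * 1024, 4, 2))
```

For two banks of 256K x 4 devices this gives 8 devices per bank, 2 MBytes in total, and 21 address lines.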
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Figure 8.3 Detailed Connections for the R3721 in a Two Banks Non-interleaved Memory System 







APPLICATION EXAMPLE FOR A NON-INTERLEAVED TWO BANK 
CHAPTER 8 MEMORY SYSTEM USING THE R3721 DRAM CONTROLLER. 





The connections around the R3721 can be divided into several sections as
follows:

• CPU connections:

— The R3721 connects directly to the SysClk from the R3051 and
synchronizes its internal operation to both edges of the clock.

— The R3721 controls 2 MBytes of DRAMs and requires 21 address lines.
In this case, the R3721 only needs to connect to A/D(20:0) from the
R3051, and can have the other unused input lines tied to ground.
However, it is good practice to connect all the A/D pins of the R3721 to
the A/D pins of the R3051 (A/D(25:0)). This allows the system to be field
upgradable to larger densities of DRAMs and/or more banks without
modifications to the PCB.

— Addr(3:2) from the R3051 are connected directly to Addr(3:2) on the
R3721.

— The ALE, Rd, Wr, and Burst/WrNear pins on the R3721 connect
directly to the corresponding pins on the R3051.

— ACK and RdCEn are pulled high internally and combined with similar
signals from the rest of the system to form one set that is routed to the
R3051.

— The CS and the MSel are connected to the DRAM_CS and MSel pins
from the address decoder PAL.








This set-up is appropriate for multiple banks of "x4" DRAMs, and for a single
bank of "x1" DRAMs. Multiple banks of "x1" DRAMs require external buffers,
as described later, but are directly analogous to this system.

The connections for two banks of "x4" DRAMs should then be as follows: 

— RAS0 is connected to all the RAS input pins in bank 0 (8 devices).

— RAS1 is connected to all the RAS input pins in bank 1 (8 devices).

— CAS(3:0) will be directly mapped from the BE(3:0) outputs from the
R3051 and are connected to all the CAS input pins in the corresponding
byte lanes of both banks (4 devices per CAS).

— RAS2 and RAS3 are not used.

— OE is connected to all the OE signals of all the DRAMs (16 devices).

— WBank0 is connected to the WE input pins of all DRAMs in bank 0 (8
devices).

— WBank1 is connected to the WE input pins of all DRAMs in bank 1 (8
devices).

— WBank2 and WBank3 are not used.

— DAddr(8:0) will be connected to the 9 input address pins (A8:0) of the
DRAMs in both banks (16 devices).











Multiple Banks of "x1" DRAMs 

For multiple banks of “x1” DRAMs, each bank consists of 32 devices for a 32- 
bit data bus. In such topologies, the number of DRAM devices could be as many 
as 128 devices, which is much greater than the drive capacity of the output 
buffers of the R3721. The R3721 is designed to drive a maximum of 36 DRAM 
devices. In the case of very large loads, the use of external buffers or drivers is 
highly recommended. Note that the timing of CAS is a particularly critical 
system parameter. The R3721 is defined for optimal performance when CAS 
drives 8 devices or less; thus, when using multiple banks of "x1" devices, CAS 
should be buffered to reduce loading. The drive capability of the RAS signals 
(RASO to RAS3) is specified for up to 36 devices, but since all the other output 
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control signals will be buffered, the RAS signals should also be buffered (to 
minimize timing skew). 
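The loading figures above can be tallied per signal to decide when external buffers are called for. The sketch below is one way to do that count, assuming a 32-bit bus with four byte lanes and using the two limits quoted in the text (36 devices maximum, 8 devices or fewer on CAS for optimal timing). The function names and the dictionary layout are illustrative only.

```python
# Sketch: count DRAM loads per R3721 output and flag heavy loading.
MAX_LOADS = 36          # maximum devices the R3721 is designed to drive
CAS_OPTIMAL_LOADS = 8   # CAS loading for which optimal timing is defined


def signal_loads(width_bits, banks, bus_bits=32):
    """Return the number of DRAM devices seen by each class of output."""
    per_bank = bus_bits // width_bits          # devices in one bank
    per_lane = per_bank // (bus_bits // 8)     # devices in one byte lane of a bank
    return {
        "RAS": per_bank,             # each RAS drives one bank
        "WBank": per_bank,           # each WBank drives one bank
        "CAS": per_lane * banks,     # each CAS drives one byte lane of every bank
        "OE": per_bank * banks,      # OE goes to every device
        "DAddr": per_bank * banks,   # address lines go to every device
    }


def needs_buffering(loads):
    """True when CAS or any other output exceeds the quoted limits."""
    return loads["CAS"] > CAS_OPTIMAL_LOADS or max(loads.values()) > MAX_LOADS


if __name__ == "__main__":
    print(signal_loads(4, 2), needs_buffering(signal_loads(4, 2)))
    print(signal_loads(1, 4), needs_buffering(signal_loads(1, 4)))
```

Two banks of "x4" devices stay within both limits, while four banks of "x1" devices present 128 loads on the shared outputs and 32 on each CAS line.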

For systems where only one or two banks of memory are used, the system 
designer should opt for the solution to route the extra unused control outputs 
to unpopulated slots to allow for future field upgrades to denser DRAMs and/ 
or extra memory banks. These signals include the unused RAS and WBank 
signals, and high-order DAddr lines. 

In Chapter 7, it was shown that care must be taken to ensure that no
spurious writes occur during page mode writes. Specifically, note that the
WBank signal could be asserted on the same clock edge used to negate the CAS
signals. Most DRAMs require a Trch of 0 ns. In most systems, this is usually
guaranteed, since the WBank line drives a larger load than CAS. However, the
system designer could choose more design margin by buffering WBank.

It is also possible to use multiple R3721 DRAM controllers in systems with 
very deep memory requirements. In these systems, each R3721 can control 
up to 64 MBytes. The selection among the various R3721 subsystems is
performed using the CS inputs from the system address decoder. The use of 
multiple R3721's can also serve to reduce the loading effect on the output 
control signals, and thus reduce memory latency. In a system with multiple 
R3721 subsystems, the address decoder would typically arrange the DRAM to 
appear contiguous in memory (starting at physical address "0"), while the mode 
registers may be scattered throughout the I/O space. 

• Data Path connections:

In this example of a non-interleaved configuration, 74FCT245 data
transceivers are used as data buffers and the connections are as
follows:

— T/R is connected to the T/R pins of all 4 transceivers (4 devices).

— DByteEn(3:0) are connected to the output enable (OE) input pins of the
transceiver of the corresponding byte lane (1 device per DByteEn).

— Path and YZLEn are not used.





If the IDT73720 Bus Exchangers were used in the data path, the connections
should be as follows:

— T/R is connected to the T/R pins of both IDT73720 Bus Exchangers (2
devices).

— DByteEn(3:0) are connected to the output enable (OE) pin of the half of
the Bus Exchanger of the corresponding byte lane (1 load each).

— Path will connect to the Path input pin of both Bus Exchangers (2
devices).

— YZLEn will connect to both the LEYX and LEZX pins of each Bus
Exchanger (total of 4 loads).


Finally, note that it is recommended that pull-up or pull-down resistors be 
used on the data lines. This will reduce power consumption during partial 
word accesses and idle cycles, when the bus is not being actively driven. 







SETTING THE MODE REGISTER 

In order to obtain the best performance of the R3721 DRAM Controller, the 
internal mode register must be programmed with the appropriate values 
tailored to the application at hand. In the example used in this chapter, the
system is assumed to be a non-interleaved memory system running at 20 MHz
using 256K x 4 DRAMs with 100 ns of access time (“trac” = 100 ns). In order to
determine proper values for the mode register, the system designer must 
consider the AC characteristics of the R3051, the R3721 and the data buffers 
(IDT73720’s or IDT74FCT245’s). In addition, the system designer must calculate 
the derating effect due to capacitive loading on the signal traces. 


Derating Effect Due to Capacitive Loading 

Capacitive loading from the devices, the length of the traces on the PC
board, and the propagation delay of the signal in travelling across the board
all add delay to the signals. These factors collectively are known as derating
factors. Derating factors are arrived at by making approximate calculations of
the capacitance. The capacitance obtained is compared with the rated drive
capability of the IC component. The effect of additional capacitance on the
timing is computed based on data sheet deratings:

1. The typical derating factor of the output driver for standard logic devices
is 1 ns/50 pF.

2. The derating factor of the output driver for the CPUs is 1 ns/25 pF.

3. The traces typically have a capacitance of 2 pF/inch.

4. The signal travels at a speed of 0.2 ns/inch on an FR4 substrate.


The system designer should consider the derating effects described above 
and should use these or other values appropriate to the specific design in 
question in order to calculate the worst case interface timing. 


The derating delay due to capacitive loading, tdr, should be computed as
follows:

tdr = trace length in inches * 0.2 ns/inch +
      [(number of loads * input capacitance per load) -
       (rated capacitive load of the output driver)] *
      the derating factor of the output driver

tdr = derating delay = __ ns
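The tdr formula is easy to evaluate for each signal in turn. The sketch below implements it term by term as written, assuming the derating factor is expressed in ns per pF (for example 1/50 for standard logic). The function name and the example figures (a 5 inch trace driving 16 loads of 7 pF from a driver rated at 50 pF) are hypothetical values chosen only to exercise the formula.

```python
# Sketch of the tdr formula from this section. Argument names are illustrative.
def derating_delay_ns(trace_inches, loads, pf_per_load, rated_pf,
                      ns_per_pf, ns_per_inch=0.2):
    """Return the derating delay tdr in nanoseconds."""
    flight_time = trace_inches * ns_per_inch
    # Capacitance in excess of what the output driver is rated for.
    excess_pf = loads * pf_per_load - rated_pf
    return flight_time + excess_pf * ns_per_pf


if __name__ == "__main__":
    # Hypothetical case: standard logic driver derated at 1 ns per 50 pF.
    tdr = derating_delay_ns(trace_inches=5, loads=16, pf_per_load=7,
                            rated_pf=50, ns_per_pf=1 / 50)
    print(round(tdr, 2), "ns")
```

The sketch follows the printed formula and does not clamp the excess capacitance at zero, so a lightly loaded driver yields a negative capacitive term; a designer may prefer to treat that case as zero.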


In addition, the system designer must consider the variation between the
R3051 output clock high time and output clock low time. This variation in
the clock, tvr, is expressed in the R3051 data sheet by the t32 and the t33
parameters and is equal to:

tvr = ± 2 ns at 25 MHz and less

tvr = ± 1 ns at more than 25 MHz.


Obviously, this effect only needs to be considered for events which occur at 
half-clock cycle intervals; the R3051 guarantees that the period of SysClk will 
be regular rising edge to rising edge or falling edge to falling edge. 
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The analysis to set the mode register should then be as follows: 
• DRAM Page Size field (DZ1:0):

The system designer should set this field depending on the size of the
DRAMs used in the external system (from 256K x 1 to 4M x 4). In the case
of this example, the DRAM size used is 256K x 4 and the DZ1:0 bits are
set to “0 0”.

• External memory configuration (Inlvd):

The system designer has the choice between interleaved and
non-interleaved configurations and the types of data buffers used for the
non-interleaved configurations. For this example, 74FCT245 transceivers are
used in a non-interleaved system and the Inlvd bit is set to “0”.

• WrNear (WrNr):

The WrNr bit in the mode register can be used to force the DRAM 
controller to ignore the processor WrNear output signal during write 
accesses. This feature is important for interleaved systems using DRAM 
SIMM modules, where the OE of the DRAM banks are grounded. In these 
systems, a write to one array will cause the other array to be read; to avoid 
bus contention in consecutive writes, the WrNr bit forces near writes to 
be retired in three cycles (rather than two), thus allowing time for bus 
contention to be avoided. In this system, this is not a problem; the WrNr 
field is set to '0' to allow fast writes. 

• RAS to CAS delay (RCD):

The RAS to CAS delay is the delay, expressed in clock cycles, from the
assertion of a RAS signal to the assertion of the corresponding CAS signal(s).
This parameter is defined from the “trcd” parameter found in the DRAM data
sheets. As stated in DRAM data sheets, “trcd” is important during read
accesses. If the actual RAS to CAS delay is less than the max “trcd” specified,
then the access is controlled by the RAS strobe. On the other hand, if the
actual RAS to CAS delay is greater than the max “trcd” specified, then the
access is controlled by the CAS strobe. Accordingly, there are two criteria to
consider in deciding on the setting of this bit.

— There is the row address hold time “trah” specified in the DRAM data
sheet, which determines how long the row address must be held
constant after the assertion of the RAS signal. This parameter is usually
around 10 to 15 ns. If the RCD bit is set to “0”, DAddr will switch from
the row address to the column address half a cycle after the assertion
of the RAS signal. At 20 MHz, this is equivalent to 25 ns. If the RAS signal
is heavily loaded, violation of this parameter could occur. In that case,
setting RCD to “1” would be a more prudent choice.
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~ During single read accesses, or for the initial latency of quad word read 
accesses, if the actual RAS to CAS delay is less than the max “trcd”
specified, then the first word access is controlled by the RAS strobe. The
system designer must make sure, in that case, that the data will be valid
when the R3051 samples it. During read accesses, the R3051 samples
the data in at the same edge the CAS signals are negated. The system
designer should proceed with the following analysis for RCD set to “0”
as shown in Figure 8.4:


tx1 = RAS to CAS delay = 1 clock cycle minimum +
tx2 = CAS pulse width  = 1.5 clock cycles minimum
tx3 = total available  = 2.5 clock cycles

tx4 = minimum time available from assertion of RAS
    = [tx3 * (1/frequency of operation)] - tvr
    = [2.5 clock cycles * (1/frequency of operation)] - tvr

tx5 = access time from RAS (“trac” max, DRAM d/s)  = __ ns +
tx6 = delay through data buffers (max, ’245 d/s)    = __ ns +
tx7 = data setup time for R3051 (t2 max, R3051)     = __ ns +
tx8 = max capacitive derating effect (tdr max)      = __ ns
tx9 = maximum time to obtain data                   = __ ns

For a valid system, tx9 should be less than tx4.
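The RCD worksheet can be filled in numerically. The sketch below does so for the 20 MHz system of this chapter with “trac” = 100 ns and tvr = 2 ns. The buffer delay (7 ns), setup time (6 ns) and derating estimate (5 ns) are assumptions borrowed from the CAS pulse width example later in this chapter; they are not given for this worksheet in the text, and the function name is illustrative.

```python
# Sketch of the RCD analysis: compare the time available from RAS (tx4)
# with the time needed to deliver data to the R3051 (tx9).
def rcd_budget(freq_mhz, tvr_ns, trac_ns, buffer_ns, setup_ns, derating_ns,
               rcd_cycles=1.0, cas_cycles=1.5):
    """Return (tx4, tx9) in nanoseconds."""
    period_ns = 1000.0 / freq_mhz
    tx4 = (rcd_cycles + cas_cycles) * period_ns - tvr_ns
    tx9 = trac_ns + buffer_ns + setup_ns + derating_ns
    return tx4, tx9


if __name__ == "__main__":
    # buffer, setup and derating values are assumed, see the text above
    tx4, tx9 = rcd_budget(freq_mhz=20, tvr_ns=2, trac_ns=100,
                          buffer_ns=7, setup_ns=6, derating_ns=5)
    print(tx4, tx9, "valid" if tx9 < tx4 else "violates budget")
```

Under these assumed delays the data needs 118 ns against 123 ns available, so the one clock cycle RAS to CAS setting holds with a small margin.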


In this example, RCD has been set to “0”. This corresponds to one clock cycle
from RAS to CAS.
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Figure 8.4 Analysis to Set RCD in the Mode Register 


• RAS Timing (R2:0):

The RAS timing field encodes the RAS pulse width as well as the RAS
pre-charge time. The system designer must set these three bits such that the
specified RAS pulse width “tras” in the DRAM data sheets and the specified RAS
pre-charge time “trp” are not violated. In this example, the RAS pulse width is
set to 3 clock cycles, which is 150 ns and is longer than the required 100 ns. The
RAS pre-charge time is set to 2 clock cycles, which is 100 ns and longer than
the required 70 or 80 ns. R2:0 are then set to “0 0 1”.


• CAS pulse width (CO):

The R3721 is designed in such a way that during read accesses, the CAS
signals are negated by the same edge at which the R3051 samples data.
Further, during most read accesses (single read or quad word reads) the data
path is assumed to be set to pass the desired data to the CPU (outputs of data
buffers enabled, data buffers in the receive mode, and the latches
transparent). This means that from the CAS strobe (or the RAS strobe) the data
coming out of the DRAMs passes through the data buffers directly to the
R3051. Except for interleaved quad word read accesses, no latching of the data
takes place. The system designer must ensure that the CAS pulse width is long
enough for the data to come out of the DRAMs, pass through the data buffers and
meet the data setup time of the R3051.
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The system designer should proceed with the following analysis illustrated 


in figure 8.5: 
ty1  = CAS pulse width = 1.5 or 2.5 clock cycles
ty1' = [CAS pulse width * (1/frequency of operation)] - tvr
     = the time within which the data must be present at the input of the R3051.

ty2 = SysClk to CAS low (t1 max, R3721)               = __ ns +
ty3 = access time from CAS (“tcac” max, DRAM d/s)     = __ ns +
ty4 = delay through the data buffer (max, ’245 d/s)   = __ ns +
ty5 = R3051 data input setup time (t1a max, R3051)    = __ ns +
ty6 = max capacitive derating effect (tdr max)        = __ ns
ty7 = max time for data to be ready                   = __ ns


For proper operation, ty7 must be less than ty1'.


For the example in this chapter, the CAS pulse width is set to 1.5 clock
cycles, then:

ty1  = 1.5 clock cycles
ty1' = [1.5 * 50 ns] - 2 ns = 73 ns
ty2  = 8 ns
ty3  = 25 ns
ty4  = 7 ns
ty5  = 6 ns
ty6  = 5 ns (estimate)
ty7  = 51 ns


ty7 is less than ty1' and the system should run properly.
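The same check can be run for either CAS pulse width setting. The sketch below reuses the numbers of the example above (50 ns clock period, tvr of 2 ns, and delays of 8, 25, 7, 6 and 5 ns); the function name and return layout are illustrative only.

```python
# Sketch of the CAS pulse width analysis: time available from the CAS
# strobe versus the summed delays on the way to the R3051.
def cas_pulse_budget(cas_cycles, period_ns, tvr_ns, delays_ns):
    """Return (available_ns, needed_ns, margin_ns)."""
    available = cas_cycles * period_ns - tvr_ns
    needed = sum(delays_ns)
    return available, needed, available - needed


if __name__ == "__main__":
    delays = [8, 25, 7, 6, 5]   # the five delay terms from the example
    for cas_cycles in (1.5, 2.5):
        print(cas_cycles, cas_pulse_budget(cas_cycles, 50.0, 2.0, delays))
```

With the 1.5 clock cycle setting the margin is 22 ns, which is why the longer 2.5 cycle pulse is not needed in this system.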
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e CAS pre-charge time (CP): 

Most DRAMs require a CAS pre-charge time of about 10 ns, which the half
clock cycle setting covers. This setup is appropriate for most medium
speed applications. However, the CAS pre-charge time is important during the
page mode operation of the DRAMs. There are two criteria to consider in setting
this bit:

— During page read operations, the CAS is pre-charged and then
re-asserted to enable the next word from the DRAMs, as illustrated in
figure 8.6 (a). In such situations, the next word to be read from the
DRAMs will be available after a delay corresponding to:

- access time from address “taa” or

- access time from CAS “tcac” or

- access time from CAS pre-charge “tacp”

whichever is longer (as per the DRAM data sheet). The system designer must
then take into consideration the access time from the CAS pre-charge. The
analysis for the access from the assertion of the CAS is the same as for the
CAS pulse width analysis in figure 8.5. The analysis for the CAS pre-charge
time is as follows:


tz1 = time from CAS pre-charge to when the next data word must be
available. This time equals the sum of the CAS pulse width and the CAS
pre-charge times, with a minimum of 2 clock cycles and a maximum of
4 clock cycles.

A tz1 of 3 or 4 clock cycles is irrelevant since the access will then
be determined completely by the CAS pulse width. This analysis will
concentrate on the 2 clock cycle tz1, where the CAS pre-charge time is
0.5 clock cycles and the CAS pulse width is 1.5 clock cycles.

tz1' = time for next data element
     = 2 clock cycles * (1/frequency of operation)

tz2 = SysClk to CAS pre-charge (t1a max, R3721)        = __ ns +
tz3 = access time from CAS pre-charge
      (“tacp” max, DRAM d/s)                           = __ ns +
tz4 = delay through the data buffer (max, ’245 d/s)    = __ ns +
tz5 = R3051 data input setup time (t1a max, R3051)     = __ ns +
tz6 = max capacitive derating effect (tdr max)         = __ ns
tz7 = max time for data to be ready                    = __ ns


For proper operation, tz7 must be less than tz1'.


For the example of this chapter, the CAS pulse width is set to 1.5 clock cycles
and the CAS pre-charge time is set to 0.5 clock cycles, then:

tz1  = 2.0 clock cycles
tz1' = 2.0 * 50 ns = 100 ns

tz2  = 8 ns
tz3  = 55 ns
tz4  = 7 ns
tz5  = 6 ns
tz6  = 5 ns (estimate)

tz7  = 81 ns


tz7 is less than tz1' and the system should run properly.
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Figure 8.6 (a) CAS Pre-charge Time Analysis 


— During page write accesses where the CAS pulse width is 1.5 clock 
cycles and the CAS pre-charge time is set to 0.5 clock cycles, the system 
designer must ensure that the data is available at the DRAM inputs 
before asserting the CAS strobes. That is the data from the R3051 
through the data buffers in addition to the DRAM data setup time must 
be less than one half clock cycle which is tw6. This timing analysis is 
illustrated in figure 8.6 (b) and is as follows: 


tw6 = one half clock cycle - tvr = ns 
twl = SysClk to data from the R3051 (t19 max, R3051) = ns + 
tw2 = delay through the data buffer (max, ’245 d/s) = ns + 
tw3 = DRAM data setup time (“tds” min, DRAM d/s) = ns + 
tw4 = max capacitive derating effect (tdr max) = ns 
tw5 = max time for data to be ready = ns 


this time (tw5) must be less than tw6 for proper operation. 
For the example of this chapter, the CAS pulse width is set to 1.5 clock cycles 
and the CAS pre-charge time is set to 0.5 clock cycles, then 


twl = 10ns 

tw2 = 7ns 

tw3 = Ons 

tw4 = 5ns (estimate) 
tw5 = 22ns 


tw5 is less than tw6, which is 25 - 2 = 23 ns, and the system should
run properly. For well laid-out systems the derating factor could be reduced
to 3 or 4 ns and thus provide more margin for the DRAM data setup time.
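The page write margin is tight enough that it is worth recomputing whenever the layout estimate changes. The sketch below evaluates tw6 minus tw5 using the example values above (10 ns, 7 ns, 0 ns and a 5 ns derating estimate at a 50 ns clock period) and again with a 3 ns derating; the function name is illustrative only.

```python
# Sketch of the page write analysis: data must reach the DRAM inputs within
# half a clock cycle, less the clock high/low variation tvr.
def page_write_margin_ns(period_ns, tvr_ns, t19_ns, buffer_ns, tds_ns,
                         derating_ns):
    """Return tw6 - tw5 in nanoseconds (positive means the timing is met)."""
    tw6 = period_ns / 2 - tvr_ns
    tw5 = t19_ns + buffer_ns + tds_ns + derating_ns
    return tw6 - tw5


if __name__ == "__main__":
    for derating in (5, 3):
        print(derating, page_write_margin_ns(50.0, 2.0, 10, 7, 0, derating))
```

A negative result would indicate that the page writes should be slowed down, for example with the delayed chip select bit described next.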
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Figure 8.6 (b) CAS Pre-charge Time Analysis During Writes 


• Refresh Period (RF2:0):

The refresh period must be set according to the frequency of operation. In
this example, the RF2:0 bits are set for 20 MHz operation at “1 0 0”.


• Delayed Chip Select (DCS):

The delayed chip select bit must be set if the external address decoder is not
fast enough to meet the fast chip select requirements, that is, if the external
decoder cannot provide chip select within the first clock cycle of the access.

The delayed chip select feature can also be set to slow down the page write
accesses. The main reason for this is demonstrated in figure 8.6 (b). Slowing
down the page writes is appropriate when the delay through the data buffer is
such that the data is not available to the DRAMs within half a clock cycle. In
this case, setting the DCS bit will slow the page write operation as demonstrated
in figure 7.12 (c). The slow CS mode must be enabled in systems using a 2.5
clock cycle CAS pulse width, to ensure proper write operation. It will also have
the effect of adding an extra clock cycle for every access.

In this example, the data can be available to the DRAMs within the half clock 
cycle (25 ns), and thus the DCS bit is cleared. 
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Figure 8.7 illustrates the settings of the mode register used for this system. 
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Figure 8.7 Mode Register Settings for a Two Bank Non-interleaved System 


SYSTEM TIMING DIAGRAMS 


In general, Chapter 7 illustrated all of the relevant protocol for non- 
interleaved systems. Figures 8.8 and 8.9 are provided to show the exact 
signalling of RAS, which is used to select a particular bank during a particular 
access. 

Figure 8.8 illustrates the timing diagrams involved in a single read access 
to bank 1 starting from an idle, RAS asserted state. Bank 1 is selected by only 
asserting the RAS(1) signal. Note that in this drawing, a RAS precharge is 
required prior to the RAS(1) pulse. 
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Figure 8.8 Single Read Access to Bank(1) 
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Figure 8.9 illustrates the timing diagrams involved in a single write access 
to bank 0 starting from an idle, RAS asserted state. Again, note that the
current RAS is for bank 1; thus, a RAS precharge cycle is required. 
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Figure 8.9 Single Write Access to Bank(0) 
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INTRODUCTION 
This chapter describes the use of the R3721 DRAM controller in an 
interleaved memory system. Included in this chapter is a discussion of: 
• The effect of various configurations of the mode register on the timings of
the output signals.
• A description of an interleaved DRAM memory system connected to the
R3051.
• A detailed illustration of the timing diagrams involved in the various
processor memory transactions.


INTERLEAVED SYSTEM DESIGN 

An interleaved memory system consists of 1 or 2 bank-pairs of “x1” or "x4" 
DRAMs connected with the R3051 by the R3721 DRAM controller. Each 
memory bank-pair consists of an even half (32-bit bank) and an odd half (32- 
bit bank). Even halves and odd halves are determined by the low order address 
bits (Addr2 bit), while the bank-pairs are selected by a high order address bit. 
Each half bank-pair in the interleaved configuration is analogous to a bank in 
the non-interleaved configurations. The R3721 uses the DRAM size information
encoded in the mode register and selects the appropriate memory devices 
individually by decoding the address bits from the R3051 address/data bus. 
In the interleaved configuration, each RAS controls a single half-bank (i.e. 
RASO controls the even half of bank 0, RAS1 controls the odd half of bank 0, 
RAS2 controls the even half of bank 1, and RAS3 controls the odd half of bank 
1). In the interleaved configuration, the R3721 directly controls the IDT 73720 
Bus Exchangers in the data path. 

The primary benefit of an interleaved memory configuration occurs in quad 
word reads. Interleaved memory does not reduce the latency of DRAM access, 
and thus does not benefit single transactions (single reads, single writes, page 
reads and page writes). In multiple word accesses, however, interleaved 
memory obtains higher bandwidth from the DRAM devices, and thus 
dramatically improves the performance of these accesses. 

For ease of discussion, all timing diagrams illustrated in this chapter 
assume the settings of the mode register as shown in Figure 9.1. These settings 
correspond to an interleaved system with the following:

• 1M x 4 DRAMs,

RAS pulse width is 3 clock cycles, RAS pre-charge time is 2 clock cycles, 
CAS pulse width is 1.5 clock cycles, 

CAS pre-charge time is 0.5 clock cycles, 

2 clock cycles delay from RAS to CAS, 

Fast chip-select mode, 

WrNear paras 
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Figure 9.1 Settings of the Mode Register Used as an Example in 
this Chapter for Interleaved Memory Systems 
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The timing diagrams illustrated in this chapter apply for single bank-pair or 
two bank-pair interleaved systems. For accesses in the even array of a
bank-pair, the Path signal is always high; for accesses in the odd array of a bank-pair,
the Path signal is low. RAS(even) is either RASO or RAS2 while RAS(odd) is 
either RAS1 or RAS3. All four CAS signals are connected to every array in the 
system, on a byte-lane basis. 
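The mapping from an address to a RAS line and Path level in the interleaved configuration can be sketched as follows. Addr2 selects the even or odd half, as stated above; the position of the high order bit that selects the bank-pair depends on the DRAM size and is passed in as a parameter. The value 23 used in the example is an assumption for 1M x 4 devices (8 MBytes per bank-pair) and is not taken from the data sheet, and the function name is illustrative.

```python
# Sketch: which RAS line holds a given word, and the Path level that routes
# it to the CPU, in the interleaved configuration.
def interleaved_select(addr, bank_pair_bit):
    """Return the RAS index (0..3) and Path level for a byte address."""
    odd = (addr >> 2) & 1                 # Addr2: 0 = even half, 1 = odd half
    pair = (addr >> bank_pair_bit) & 1    # high order bit: bank-pair 0 or 1
    return {
        "ras": 2 * pair + odd,            # RAS0/RAS2 are even, RAS1/RAS3 are odd
        "path_high": odd == 0,            # Path is high for even-array accesses
    }


if __name__ == "__main__":
    PAIR_BIT = 23  # assumed select bit for 1M x 4 DRAMs, for illustration only
    for address in (0x0, 0x4, 1 << PAIR_BIT, (1 << PAIR_BIT) | 0x4):
        print(hex(address), interleaved_select(address, PAIR_BIT))
```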








SINGLE READ TRANSACTION TIMINGS 

In general, there are only two types of read transactions from the R3051: 
quad word reads and single datum reads. Quad word reads occur only in 
response to cache misses. All instruction cache misses are processed as quad 
word reads; data cache misses may be processed as quad word reads or single 
word reads. Uncached references are always processed as single word reads. 
This section describes the timing diagrams involved in single word reads; a 
later section in this chapter describes the quad word read operations. 


Start of Single Read Access 

The R3721 determines the beginning of a single read access by monitoring 
the assertion of the ALE and the Rd signals from the R3051. The R3721 
multiplexes the input address from the R3051 according to the DRAM 
configuration and outputs the row address on the DAddr bus. If the fast 
chip-select mode is selected (DCS bit in mode register = 0), the CS input must 
be valid before the following rising edge of SysClk for the R3721 to respond 
to the access; otherwise the R3721 assumes the access to be outside of the 
memory space it controls and does not assert any DRAM control signals. In the 
slow chip-select mode, the CS input must be valid by the following falling 
edge of SysClk. Chapter 7 illustrates the start of a single read. 

For an interleaved system, the R3721 will assert both the even and the odd 
RAS signals for any single read access. The level of the Path signal will 
direct the proper word from the indicated memory array to the R3051. This is 
important, since after the single read access, the page mode of the DRAMs is 
enabled and the following access could be in the even half-bank-pair or the 
odd half-bank-pair. By asserting both RAS control signals, the page mode of 
both halves of the DRAM bank-pair is enabled for subsequent page mode 
accesses. 

Memory Control Signals for Single Read Accesses 

After the detection of the CS signal, the R3721 starts to issue the various 
control signals to the DRAMs in the following way: 

• On the rising edge of SysClk following CS, the appropriate RAS(even) and 
RAS(odd) signals are issued (RAS0 and RAS1 for access to bank-pair 0, 
RAS2 and RAS3 for access to bank-pair 1). The ACK and RdCEn outputs 
are enabled and driven to a level “high”. 

• Depending on the value of the RCD bit in the mode register, the R3721 can 
proceed in two different ways: 











If RCD=0, the column address is presented on the DAddr bus on the falling 
edge of SysClk following the assertion of the RAS signals. The appropriate 
CAS(3:0) signals are asserted on the next rising edge of SysClk (CAS0 for access 
to D(7:0), etc). The Path signal is set to 1 for accesses to the even half and set 
to 0 for accesses to the odd half. In the interleaved configurations, accesses to 
the even or odd halves are determined by the Addr2 bit from the R3051. 

If RCD=1, the column address is presented on the DAddr bus on the falling 
edge of SysClk, 1.5 clock cycles following the assertion of the RAS signals. The 
CAS signals are asserted on the following rising edge of SysClk. The Path signal 
is set to 1 for accesses to the even half and set to 0 for accesses to the odd half. 
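
As a rough aid to reading the two RCD cases above, the following Python sketch 
expresses the column address and CAS assertion points as offsets, in SysClk 
cycles, from the rising edge on which RAS is asserted. The function name and 
the choice of RAS assertion as time zero are assumptions made for 
illustration only. 

```python
def rcd_offsets(rcd: int) -> dict:
    """Offsets (in SysClk cycles after RAS assertion) for one access.

    rcd -- value of the RCD bit in the mode register (0 or 1)
    """
    if rcd == 0:
        column_address = 0.5   # falling edge following RAS assertion
    elif rcd == 1:
        column_address = 1.5   # falling edge 1.5 cycles after RAS assertion
    else:
        raise ValueError("RCD is a single bit")
    # CAS is asserted on the rising edge that follows the column address.
    return {"column_address": column_address,
            "cas_assert": column_address + 0.5}


if __name__ == "__main__":
    for bit in (0, 1):
        print(bit, rcd_offsets(bit))
```

With RCD=1 the sketch gives a RAS to CAS delay of 2 clock cycles, which is the 
value assumed by the example settings of Figure 9.1. 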


End of a Single Read Access 

Depending on the settings of the CAS pulse width in the mode register, the 
CAS signals are kept asserted for 1.5 or 2.5 clock cycles and negated on the 
falling edge of SysClk. The R3721 is designed in such a way that the R3051 
samples the data on the same falling edge used to negate the CAS signals. The 
R3721 asserts the ACK and RdCEn signals one clock cycle before negating the 
CAS signals. The ACK and RdCEn signals are asserted on the falling edge of 
SysClk and kept asserted for one clock cycle. 
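
The end-of-access relationships described above can be sketched as follows. 
Times are in SysClk cycles measured from an arbitrary reference (for example 
the RAS assertion edge); the function and key names are hypothetical and are 
not taken from the R3721 documentation. 

```python
def read_completion(cas_assert: float, cas_pulse_width: float) -> dict:
    """Derive the closing events of a single read from the CAS timing.

    cas_assert      -- time at which CAS is asserted (a rising edge)
    cas_pulse_width -- 1.5 or 2.5 clock cycles, as set in the mode register
    """
    if cas_pulse_width not in (1.5, 2.5):
        raise ValueError("CAS pulse width must be 1.5 or 2.5 cycles")
    cas_negate = cas_assert + cas_pulse_width   # lands on a falling edge
    return {
        "cas_negate": cas_negate,
        "data_sampled": cas_negate,      # CPU samples on the same edge
        "ack_assert": cas_negate - 1.0,  # ACK/RdCEn asserted one cycle earlier
        "ack_negate": cas_negate,        # and held for one clock cycle
    }


if __name__ == "__main__":
    # Figure 9.1 settings: CAS asserted 2 cycles after RAS, 1.5-cycle pulse.
    print(read_completion(cas_assert=2.0, cas_pulse_width=1.5))
```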

To take advantage of the page mode capabilities of the DRAMs, the R3721 
always assumes that any single read access will be followed by another access 
(read or write or quad word reads) within the same DRAM page. Based on this 
assumption, both the RAS signals (RAS(even) and RAS(odd)) are kept asserted 
at the end of the single word read access to enable the page mode of the DRAMs. 
Chapter 7 illustrates the timing diagrams in ending a single read access for 
both values of the CAS pulse width. 

Figure 9.2 illustrates the complete control timings involved in a single read 
access for the settings of the mode register illustrated in Figure 9.1. This figure 
represents a generic timing diagram in which the access could be for the even 
or the odd half-bank-pair. This is why the Path signal is shown with both of its 
two possible values. 

Figure 9.2 Example of a Single Read Access for Interleaved Memory Systems 


Page Read Accesses 

Page read accesses are single read accesses from the R3051 that happen to 
fall within the same DRAM page as the previous single read or single write 
access. The R3721 determines the maximum page size of the memory system 
based on the DRAM size information encoded in the mode register. 

The page read access in interleaved memory systems takes advantage of the 
previous cycle in that the RAS signals are already asserted. The page read 
access has very similar timing to the single read access with the exception that 
no time is lost in re-asserting the RAS signals and re-multiplexing the row and 
column addresses. 

Once the R3721 detects the start of a single read access from the R3051 and 
determines that it is within the same page, it outputs the column address to 
the DRAMs. The page read access is then terminated as for a single read access. 
Figure 9.3 illustrates the timing diagrams for a page read access for the settings 
of the mode register illustrated in Figure 9.1. 


Single Read Access Outside of Page 

Single read accesses outside of the current page are single read accesses 
from the R3051 that happen to fall outside the DRAM page accessed by the 
previous single read or single write access. The read access outside of the 
current page cannot take advantage of the previous cycle, and thus RAS must 
be pre-charged before the read access is begun. Once RAS is pre-charged, the 
access continues as for a single word read. 

Once the R3721 detects the start of a single read access from the R3051 and 
determines that it is not within the same page, it outputs the row address to 
the DRAMs. On the second rising edge of SysClk, both RAS signals are negated 
to begin the RAS pre-charge. The RAS signals are kept high for the time 
specified in the mode register (a minimum of 2 clock cycles). The access 
then continues as for a single read access: the RAS signals are asserted, the 
column address is presented, the CAS signals are asserted, and the access is 
terminated as for a single read access. Figure 9.4 illustrates the timing 
diagrams for a read access outside of page for the settings of the mode 
register as illustrated in Figure 9.1. 

Figure 9.4 Single Read Access Outside of Page for the Interleaved Memory System 


SINGLE WRITE TRANSACTION TIMINGS 

In the R3051 family, a significant percentage of the bus traffic will be 
processor writes to memory. This is due to the write-through nature of the 
processor data cache: all processor writes are propagated to the bus; however, 
the majority of reads are satisfied by the on-chip caches, and are not seen on 
the bus. 

Note that for the R3051 there is no such thing as a “quad word” write; the 
R3051 performs a word or a subword write as a single autonomous bus 
transaction. However, the R3051 provides a WrNear signal to indicate that the 
present write has the same upper 22 address bits as the preceding write. This 
is equivalent to a DRAM memory page of 256 words, and will work for any of 
the DRAM sizes supported by the R3721. 
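
The relationship between the 22 compared address bits and the 256-word page 
can be checked with a short Python sketch. It assumes 32-bit byte addresses; 
the names `wr_near` and `WRNEAR_IGNORED_BITS` are hypothetical, and the 
function only models the comparison, not the actual WrNear pin behaviour. 

```python
ADDRESS_BITS = 32
COMPARED_BITS = 22
WRNEAR_IGNORED_BITS = ADDRESS_BITS - COMPARED_BITS   # 10 low-order bits

# 2**10 bytes per region, 4 bytes per word.
WRNEAR_PAGE_WORDS = (1 << WRNEAR_IGNORED_BITS) // 4


def wr_near(previous_write_addr: int, current_write_addr: int) -> bool:
    """True when both writes share the same upper 22 address bits."""
    return (previous_write_addr >> WRNEAR_IGNORED_BITS) == \
           (current_write_addr >> WRNEAR_IGNORED_BITS)


if __name__ == "__main__":
    print(WRNEAR_PAGE_WORDS)
    print(wr_near(0x0000_1000, 0x0000_13FC))
    print(wr_near(0x0000_13FC, 0x0000_1400))
```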





Start of Write Access 

The R3721 determines the beginning of a single write access by monitoring 
the assertion of the ALE and the Wr signals from the R3051. The starting 
sequence for a single write access is very similar to the starting sequence of a 
single read access described earlier. The appropriate WBank(3:0) signals are 
asserted on the falling edge of SysClk after the detection of the Wr signal: 
WBank(2) and WBank(0) for writes to the even array, WBank(3) and WBank(1) 
for writes to the odd array. RAS then controls which even or odd array is 
actually written. As for a single read access, both the RAS(even) and the 
RAS(odd) signals are asserted for a single write access to enable the page mode 
of the DRAMs in both halves of the interleaved bank-pair. Chapter 7 illustrates 
the starting sequence for a write access for the fast chip-select case. 


Memory Control Signals for Single Write Accesses 

After the detection of the CS signal, the R3721 starts to issue the various 
control signals to the DRAMs in the following way: 

• On the rising edge of SysClk following CS, the appropriate RAS(even) and 
RAS(odd) signals are issued. The ACK and RdCEn outputs are enabled 
and driven to a level “high”. 

• Depending on the value of the RCD bit in the mode register, the R3721 can 
proceed in two different ways: 

If RCD=0, the column address is presented on the DAddr bus on the 
falling edge of SysClk following the assertion of the RAS signals. The 
appropriate CAS signals are asserted on the next rising edge of SysClk. The 
Path signal is set to 1 for accesses to the even array, and set to 0 for 
accesses to the odd array. In the interleaved configurations, the Addr2 bit 
from the R3051 determines the access to the even or odd array; to avoid 
writing to the wrong array, only the even or odd WBank signals are asserted. 

If RCD=1, the column address is presented on the DAddr bus on the 
falling edge of SysClk, 1.5 clock cycles following the assertion of the RAS 
signals. The CAS signals are asserted on the following rising edge of 
SysClk. 

In interleaved configurations, single write accesses differ from single read 
accesses in that only the even or odd pair of WBank signals are asserted. This, 
coupled with the assertion of RAS for the appropriate bank pair, ensures that 
only the selected 32-bit array is written into. Finally, only those byte lanes 
being written have their CAS signals asserted, to handle the case of partial word 
writes. This ensures that the data is written in the right memory locations and 
that no wrong data is written in the other half of the memory bank-pair. The 
level of the Path signal will direct the R3051 data through the Bus Exchangers 
to the proper half of the memory bank-pair. 
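
The write steering described above can be illustrated with a small Python 
sketch. It models the signals as logical "asserted" sets rather than pin 
levels, and the function name is hypothetical. 

```python
def write_steering(addr: int) -> dict:
    """Return the WBank lines asserted and the Path level for a single write.

    addr -- byte address from the CPU; bit 2 (Addr2) selects the array
    """
    odd = (addr >> 2) & 1
    return {
        # Even array: WBank(2) and WBank(0); odd array: WBank(3) and WBank(1).
        "wbank_asserted": {1, 3} if odd else {0, 2},
        # Path routes CPU data through the Bus Exchangers to the chosen half.
        "path": 0 if odd else 1,
    }


if __name__ == "__main__":
    print(write_steering(0x0000_0100))   # even word
    print(write_steering(0x0000_0104))   # odd word
```

Only one WBank pair is ever returned, which mirrors the point made above: both 
RAS lines are active, so the WBank signals are what keep the unselected array 
from being written. 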


End of a Single Write Access 

A single write access in an interleaved system is ended exactly as for a 
non-interleaved system. This operation is described in Chapter 7. 


Page Write Accesses 

The R3051 provides a WrNear signal to indicate that the present write has 
the same upper 22 address bits as the preceding write. The R3721 
has an internal page comparator that determines the actual DRAM page size 
according to the information encoded in the mode register. Based on the 
internal page comparator, the R3721 can retire a write in a minimum of 3 clock 
cycles, the same as for a page read access. However, the R3721 also uses the 
WrNear signal from the R3051 to bypass its internal comparator and CS 
detection and to retire a write access in the optimal time of 2 clock cycles. 

The page write access takes advantage of the previous cycle in that the RAS 
signals are already asserted. The page write access has a very similar timing 
to the single write access with the exception that no time is lost in re-asserting 
the RAS signals and re-multiplexing the row and column addresses. 

Once the R3721 detects the start of a single write access from the R3051 and 
determines that it is within the same page, it outputs the column address to 
the DRAMs. On the following rising edge of SysClk, the appropriate CAS signals 
are asserted in the fast chip-select mode. In the slow chip-select mode, the 
appropriate CAS signals are asserted on the second rising edge of SysClk. The 
page write access is then terminated as for a single write access. 

The R3721 uses very specific rules to determine whether or not to bypass its 
internal page comparator and to use the WrNear signal from the R3051. All of 
the following conditions must be satisfied: 

• settings in the mode register are as follows: 

- fast chip-select is enabled (DCS = 0), 
- CAS pre-charge = 0.5 clock cycle (CP = 0), 
- CAS pulse width = 1.5 clock cycles (C1:0 = 01), 
- WrNear must be enabled (WrNr = 0). 

• the previous access was a write access to the memory space controlled by 
the R3721 (CS has been asserted). 











If both conditions are satisfied, the R3721 ignores the CS input line and 
relies totally on the WrNear signal. If, at the detection of the write access, 
the WrNear signal is not asserted, the R3721 defaults to its standard mode of 
operation and retires a write in a minimum of 3 clock cycles. If the WrNear 
signal is asserted along with the Wr signal, the R3721 asserts the ACK signal 
on the falling edge of SysClk and retires the write in two clock cycles. 

With the CAS pulse width selected as 2.5 cycles, the slow CS mode must be 
selected, regardless of the CAS pre-charge selected. This is required to assure 
proper timing during write operations, and to avoid spurious writes. 

In this system, WrNear cannot be used to shorten the access, since the 
R3721 has to satisfy the CAS pulse width of 2.5 clock cycles. In this case 
CAS is still asserted for 1.5 clock cycles into the next access, so WrNear 
cannot be used to perform a 2 clock cycle write. The internal page comparator 
is not bypassed and the minimum write access requires 3 clock cycles. 

When CAS pre-charge time is 1.5 clock cycles, the CAS signal will not be 
asserted until the third clock cycle and CS has to be valid by then; thus, the 
delayed chip select has no impact on this access. In this case again, the internal 
page comparator is not bypassed and the write access is retired in 3 clock 
cycles. 
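
The bypass rules above amount to a simple predicate. The following Python 
sketch is one way to write it down; the `ModeBits` class and the function name 
are hypothetical, the field encodings are those quoted in the list above, and 
the returned value is the minimum number of clock cycles for an in-page write 
rather than a complete timing model. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeBits:
    dcs: int    # 0 = fast chip-select
    cp: int     # 0 = CAS pre-charge of 0.5 clock cycle
    c: int      # C1:0 field; 0b01 = CAS pulse width of 1.5 clock cycles
    wrnr: int   # 0 = WrNear enabled


def min_page_write_cycles(mode: ModeBits,
                          previous_was_dram_write: bool,
                          wr_near_asserted: bool) -> int:
    """Minimum cycles to retire an in-page write under the stated rules."""
    bypass_allowed = (mode.dcs == 0 and mode.cp == 0 and
                      mode.c == 0b01 and mode.wrnr == 0 and
                      previous_was_dram_write)
    # WrNear must also be asserted along with Wr for the 2-cycle write.
    return 2 if (bypass_allowed and wr_near_asserted) else 3


if __name__ == "__main__":
    figure_9_1 = ModeBits(dcs=0, cp=0, c=0b01, wrnr=0)
    print(min_page_write_cycles(figure_9_1, True, True))
    print(min_page_write_cycles(figure_9_1, False, True))
```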

Figure 9.5 illustrates the timing diagrams for a page write access to the even 
half-bank-pair for the settings of the mode register as illustrated in Figure 9.1. 

Figure 9.5 Page Write Access Timing Diagrams in Interleaved Systems 

The concept of page write and page read applies for all cases: single reads 
followed by single writes or vice versa. Figure 9.6 illustrates the timing 
diagrams for a single read access followed by a single write access followed by 
a single read access, all within the same page and based on the settings of the 
mode register illustrated in Figure 9.1. 


Single Write Access Outside of Page 

Single write accesses outside of page are single write accesses from the 
R3051 that happen to fall outside the DRAM page accessed by the previous 
single read or single write access. The write access outside of page cannot 
take advantage of the previous cycle, since RAS must be pre-charged. The single 
write access outside of page has very similar timing to the single write access 
with the exception that extra time is lost in pre-charging the RAS signals before 
re-multiplexing the row and column addresses. 

Figure 9.7 illustrates the timing diagrams for a write access outside of page 
for the settings of the mode register illustrated in Figure 9.1. 


Partial Word Write Operation 

Partial word write accesses are standard write accesses from the R3051 with 
the exception that only selected bytes within a word are enabled. This 
information is provided by the BE(3:0) signals from the R3051. The R3721 
maps the BE(3:0) from the R3051 directly into the CAS(3:0) signals. For partial 
word write accesses then, only the CAS signals of the selected bytes will be 
asserted. 
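
The direct BE-to-CAS mapping can be pictured with the Python sketch below. It 
treats the byte enables as a logical mask (bit n set means byte lane n is 
written), which is an assumption about polarity made for illustration. The 
pairing of CAS0 with D(7:0) is stated earlier in this chapter; the data ranges 
for lanes 1 to 3 are extrapolated from it. 

```python
def cas_lanes(byte_enables: int) -> list:
    """List (CAS index, data high bit, data low bit) for each enabled byte.

    byte_enables -- 4-bit logical mask derived from BE(3:0)
    """
    if not 0 <= byte_enables <= 0xF:
        raise ValueError("byte_enables is a 4-bit mask")
    return [(lane, 8 * lane + 7, 8 * lane)
            for lane in range(4)
            if (byte_enables >> lane) & 1]


if __name__ == "__main__":
    print(cas_lanes(0b0001))   # single byte on lane 0
    print(cas_lanes(0b1100))   # upper halfword
```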


Figure 9.7 Single Write Access Outside of Page in Interleaved Systems 


QUAD WORD READ TRANSACTION TIMINGS 

Quad word read operations are reads to the memory system in which the 
R3051 reads 4 contiguous words from memory always starting on an even word 
boundary. Quad word reads occur only in response to cache misses. All 
instruction cache misses are processed as quad word reads; data cache misses 
may be processed as quad word reads or single word reads. 

The advantage of an interleaved memory system lies in multiple word 
transactions, in this case quad word reads. In quad word read accesses, the 
interleaved memory system can produce the remaining three words (after the 
initial latency) at double the rate of a non-interleaved system. This is 
achieved in the interleaved memory system by reading two words from the 
memory at the same time (a word from both arrays of the bank-pair). The 
memory will pass the first word to the CPU while the DRAM controller latches 
the second word into the Bus Exchangers. On the following clock edge it will 
release the latched word to the CPU. Simultaneously, the interleaved memory 
system will pre-charge the CAS signals and produce the remaining two words 
in a similar fashion. This has the effect of doubling the bandwidth of the 
memory system. 


Start of Quad Word Read Access 

The start of a quad word read access is very similar to the start of a single 
read access, which is described earlier. The only exception is that the Burst 
signal from the R3051 is asserted at the same time as the Rd signal. Chapter 7 
illustrates the start of a quad word read access during the fast chip-select 
mode. 


Memory Control Signals During Quad Word Read Accesses 

After the detection of the CS signal, the R3721 starts to issue the various 
control signals to the DRAMs in the following way: 

• On the rising edge of SysClk following CS, the appropriate RAS(even) and 
RAS(odd) signals are issued. The ACK and RdCEn outputs are enabled 
and driven to a level “high”. 

• Depending on the value of the RCD bit in the mode register, the R3721 can 
proceed in two different ways: 


If RCD=0, the column address is presented on the DAddr bus on the falling 
edge of SysClk following the assertion of the RAS signals. The CAS signals are 
asserted on the next rising edge of SysClk. The assertion of CAS produces two 
32-bit words, Data0 (even) and Data1 (odd), one from each array of the 
bank-pair. The Path signal is set to 1 to access the even half and the YZLEn 
signal is set to 1 to enable the latches of the Bus Exchangers. 

If RCD=1, the column address is presented on the DAddr bus on the falling 
edge of SysClk, 1.5 clock cycles following the assertion of the RAS signals. The 
CAS signals are asserted on the following rising edge of SysClk. The Path signal 
is set to 1 to access the even half-bank-pair and the YZLEn signal is set to 1 
to enable the latches of the Bus Exchangers. 

• After satisfying the CAS pulse width requirement programmed in the 
mode register, the CAS signals are negated on a falling edge of SysClk. The 
CAS signals are negated on the same clock edge on which the R3051 samples 
the first data element (Data0). 

• On the same clock edge on which the CAS signals are negated, the YZLEn 
signal is also negated. This closes the latches of the Bus Exchangers, which 
now store Data0 and Data1. 

• Also on this same clock edge, the Path signal is set to 0. This enables the 
data element from the odd half-bank-pair, which is stored in the Bus 
Exchangers, to be routed to the R3051. 

• One clock cycle later, on the falling edge of the clock, the R3051 will 
sample the second data element stored in the Bus Exchangers (Data1). 
On this same falling clock edge, Path and YZLEn are reset to level 1. This 
re-enables the even half of the memory bank-pair and makes the latches 
transparent. 

• After satisfying the CAS pre-charge time encoded in the mode register, the 
CAS signals are asserted on the rising edge of SysClk. This will again 
produce two data elements (Data2 (even) and Data3 (odd)), one from each 
half-bank-pair, and the same procedure as described above is repeated 
for a second time. 

• To enable the read buffer of the R3051, RdCEn is asserted for one clock 
cycle for each word available from the memory system. In interleaved 
configurations, two data elements are produced at the same time and 
presented to the R3051 on two subsequent clock cycles; thus, RdCEn 
will usually be kept asserted for two clock cycles. If the system configuration 
is such that the 4 data elements can be presented to the R3051 on every 
subsequent falling edge of SysClk, the RdCEn signal will be kept asserted 
for 4 clock cycles before being negated. 
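
The sequence in the list above can be turned into a simplified schedule of the 
four sample points. The Python sketch below is a reading of those bullets, not 
a cycle-accurate model of the R3721: times are in SysClk cycles from an 
arbitrary reference, and the function names are hypothetical. 

```python
def quad_read_samples(cas_assert: float,
                      cas_pulse_width: float,
                      cas_precharge: float) -> list:
    """Times at which the R3051 samples Data0..Data3 in a quad word read."""
    samples = []
    t = cas_assert
    for _pair in range(2):                # (Data0, Data1) then (Data2, Data3)
        cas_negate = t + cas_pulse_width
        samples.append(cas_negate)        # even word sampled as CAS is negated
        samples.append(cas_negate + 1.0)  # latched odd word, one cycle later
        t = cas_negate + cas_precharge    # CAS re-asserted after pre-charge
    return samples


def back_to_back(samples: list) -> bool:
    """True when all four words arrive on consecutive falling edges."""
    return all(b - a == 1.0 for a, b in zip(samples, samples[1:]))


if __name__ == "__main__":
    fast = quad_read_samples(2.0, 1.5, 0.5)   # Figure 9.1 settings
    slow = quad_read_samples(2.0, 1.5, 1.5)
    print(fast, back_to_back(fast))
    print(slow, back_to_back(slow))
```

Under this reading, a 1.5 cycle CAS pulse with a 0.5 cycle pre-charge delivers 
the four words on consecutive falling edges, which is the case in which RdCEn 
stays asserted for 4 clock cycles. 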


End of a Quad Word Read Access 

To terminate a quad word read access, the memory system must return the 
ACK signal back to the R3051. To take advantage of R3051 instruction 
streaming and to ensure optimal performance, the ACK signal must be 
asserted four clock cycles before the fourth word is sampled by the R3051. The 
R3721 makes internal calculations based on the settings of the mode register 
and always asserts the ACK signal four clock cycles before the fourth word is 
ready. 

At the end of quad word read, the R3721 always negates both RAS signals 
and exits the page mode of the DRAM. Simulations have shown that the most 
probable transfer after a quad word read is a non-page mode write; thus, the 
R3721 exits page mode to pre-charge RAS, minimizing the access time of the 
next access. 

The RAS signals are always negated half a clock cycle after the negation of 
the CAS signals in any mode or configuration. Figure 9.8 (a) illustrates the 
timing of the control signals involved in a quad word read transaction for a CAS 
pulse width of 1.5 clock cycles and CAS pre-charge time of 0.5 clock cycles. 

Figure 9.8 (b) illustrates the timing of the control signals involved in a quad 
word read transaction for a CAS pulse width of 1.5 clock cycles and CAS 
pre-charge time of 1.5 clock cycles. 

In quad word read transactions, the rate at which the CAS signals are toggled 
determines the speed at which the memory system will return the remaining 3 
words to the R3051. Figure 9.9 illustrates the complete control timings 
involved in a quad word read access for the settings of the mode register 
illustrated in Figure 9.1. 


Page Quad Word Read Accesses 

Page quad word read accesses are quad word read accesses from the R3051 
that happen to fall within the same DRAM page as the previous single read or 
single write access. 

The page quad word read access takes advantage of the previous cycle in that 
the RAS signals are already asserted. The page quad word read access has a 
very similar timing to the quad word read access with the exception that no time 
is lost in re-asserting the RAS signals and re-multiplexing the row and column 
addresses. 

Once the R3721 detects the start of a quad word read access from the R3051 
and determines that it is within the same page, it outputs the column address 
to the DRAMs. On the following rising edge of SysClk, the CAS signals are 
asserted in the fast chip-select mode. In the slow chip-select mode, the CAS 
signals are asserted on the second rising edge of SysClk. The access then 
proceeds as for a standard quad word read access. Figure 9.10 illustrates the 
timing diagrams for a page quad word read access for the settings of the mode 
register illustrated in Figure 9.1. 


INTRODUCTION 


This chapter describes some of the system considerations appropriate in an 
interleaved system. In general, an interleaved system and a non-interleaved 
system are very similar; thus, the reader is referred to earlier chapters. 

This chapter contains: 

• The general system implementation and the connections between the 
R3721 and the rest of the system. 

• A detailed explanation on how to set the mode register to adapt the R3721 
to the application at hand. 

• A summary of some of the timing diagrams involved for the different types 
of CPU accesses. 


GENERAL SYSTEM DESCRIPTION 

In a typical system, the R3051 uses a 2x input clock for its internal operation 
and produces a 1x output clock, SysClk, for use by the external system. Figure 
10.1 illustrates a general purpose system based on the R3051. The system 
shown is a synchronous one, where the external state machine uses 
SysClk to synchronize its operation to the R3051. The R3721 DRAM controller 
controls two bank-pairs of interleaved DRAMs along with two IDT73720 Bus 
Exchangers for the data path. The rest of the system (EPROMs and I/O) is 
controlled by a separate, external state machine implemented in a couple of 
programmable logic devices, and is beyond the scope of this manual. 

An address decoder PAL connects directly to the outputs of the address 
latches and provides the system with the required chip-select lines. The 
address decoder also provides the R3721 DRAM Controller with the required 
CS and MSel enable lines. The R3721 controls two interleaved bank-pairs of 
1M x 4 DRAMs that reside between addresses 0x0000_0000 and 0x00FF_FFFC. 
The internal mode register of the R3721 resides in the uncached I/O space, at 
physical address 0x0100_0000. The address decoder PAL must generate the 
DRAM_CS line for any access to the DRAM memory space and must issue both 
the DRAM_CS and the MSel lines for a write access to the mode register. 

Figure 10.1 General Interleaved Memory System Using the R3051, 
the R3721 and the IDT73720 
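
The decoding required of the address decoder PAL in this example can be 
sketched behaviourally in Python. The DRAM range (0x0000_0000 to 0x00FF_FFFC) 
and the mode register address (0x0100_0000) are those of this example system; 
the function name is hypothetical, outputs are modelled as logical "asserted" 
flags rather than pin levels, and the handling of reads from the mode register 
address is an assumption, since only the write case is specified above. 

```python
DRAM_FIRST_WORD = 0x0000_0000
DRAM_LAST_WORD = 0x00FF_FFFC      # last word of the 16 MByte DRAM space
MODE_REGISTER = 0x0100_0000       # uncached I/O space


def decode(addr: int, is_write: bool) -> dict:
    """Return which of DRAM_CS and MSel the decoder asserts for an access."""
    word_addr = addr & ~0x3       # ignore the byte offset within the word
    if DRAM_FIRST_WORD <= word_addr <= DRAM_LAST_WORD:
        return {"dram_cs": True, "msel": False}
    if word_addr == MODE_REGISTER and is_write:
        return {"dram_cs": True, "msel": True}
    # Assumption: anything else (including mode register reads) is not DRAM.
    return {"dram_cs": False, "msel": False}


if __name__ == "__main__":
    print(decode(0x0012_3456, is_write=False))
    print(decode(0x0100_0000, is_write=True))
```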


DETAILED DESCRIPTION OF THE R3721 CONNECTIONS 

The R3721 controls two bank-pairs of interleaved 1M x 4 DRAMs to obtain 
a maximum DRAM memory space of 16 MBytes. Each memory bank-pair 
consists of two 32-bit wide arrays: an even array and an odd array. In the 
interleaved memory configuration, the IDT73720 Bus Exchangers are used in 
the data path to obtain the maximum performance out of the interleaved DRAM 
memory system. Two IDT73720 Bus Exchangers in the data path isolate the 
DRAM banks from the R3051 multiplexed address/data bus. This will reduce 
the loading effect on the bus and prevent contention from occurring. Figure 
10.2 illustrates the detailed connections among the various modules. 

The connections around the R3721 can be divided in several sections as 
described in Chapter 8. In this system, RAS(0) is connected to the even array 
of the first bank-pair; RAS(1) to the odd array; etc. 

Figure 10.2 Detailed Connections for the R3721 in a Two Bank-pairs Interleaved DRAM Memory System 
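
The 16 MByte figure quoted in this section follows directly from the device 
organization. The short Python sketch below reproduces the arithmetic; the 
function and parameter names are chosen here for illustration. 

```python
def dram_capacity(depth: int, device_width_bits: int,
                  array_width_bits: int = 32,
                  arrays_per_bank_pair: int = 2,
                  bank_pairs: int = 2) -> tuple:
    """Return (devices per 32-bit array, total capacity in bytes)."""
    devices_per_array = array_width_bits // device_width_bits
    array_bytes = depth * array_width_bits // 8
    total_bytes = array_bytes * arrays_per_bank_pair * bank_pairs
    return devices_per_array, total_bytes


if __name__ == "__main__":
    devices, total = dram_capacity(depth=1 << 20, device_width_bits=4)
    print(devices, total // (1 << 20), "MBytes")
```

For 1M x 4 devices this gives 8 devices per array and 4 MBytes per array, so 
two bank-pairs of two arrays each provide 16 MBytes. 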




CHAPTER 10 
APPLICATION EXAMPLE FOR AN INTERLEAVED TWO BANK-PAIR 
MEMORY SYSTEM USING THE R3721 DRAM CONTROLLER 





SETTING THE MODE REGISTER 

In order to obtain the best performance from the R3721 DRAM Controller, 
the internal mode register must be programmed with the appropriate values 
tailored to the application at hand. In the example used in this chapter, the 
system is assumed to be a two memory bank-pairs, interleaved memory system 
running at 25MHz using 1M x 4 DRAMs with 80 ns access time (“trac” = 80 ns). 
The analysis used in this chapter to set the mode register is the same one used 
in Chapter 8. In order to determine the proper values for the mode register, the 
system designer must consider the AC characteristics of the R3051, the R3721 
and the IDT73720. In addition, the system designer must calculate the 
derating effect due to capacitive loading on the signal traces. 


Derating Effect Due to Capacitive Loading 

The capacitance of the devices, the length of the traces on the PC board and 
the propagation delay of signals travelling through the board all add 
delay to the signals. These factors collectively are known as derating 
factors. Derating factors are arrived at by making approximate calculations 
of the capacitance. The capacitance obtained is compared with the rated drive 
capability of the IC component. The effect of additional capacitance on the 
timing is computed based on “rules of thumb”: 

1. The derating factor of the output driver for standard logic devices is 
1ns/50pF. 
2. The derating factor of the output driver for the CPUs is 1ns/25pF. 
3. The traces have a capacitance of 2pF/inch. 
4. The signal propagates at 0.2ns/inch on an FR4 substrate. 


The system designer should consider the derating effects described above 
and should use these or other values appropriate to the specific design in 
question in order to calculate the worst case interface timing. 

The derating delay due to capacitive loading, tdr, should be computed as 
follows: 

tdr = trace length in inches * 0.2ns/inch + 
      [(number of loads * input capacitance per load) - 
       (rated capacitive load of the output driver)] * 
      the derating factor of the output driver 

tdr = derating delay = __ns 
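
As a worked illustration of the formula above, the following Python sketch 
evaluates tdr for one hypothetical net. The load count, input capacitance and 
trace length are invented for the example; only the 0.2ns/inch figure and the 
1ns/50pF derating factor come from the rules of thumb listed earlier. The 
sketch follows the formula literally and does not clamp the result when the 
load is below the rated capacitance. 

```python
def derating_delay_ns(trace_inches: float,
                      loads: int,
                      input_pf_per_load: float,
                      rated_load_pf: float,
                      derating_ns_per_pf: float,
                      trace_ns_per_inch: float = 0.2) -> float:
    """tdr = trace delay + (load capacitance - rated load) * derating factor."""
    excess_pf = loads * input_pf_per_load - rated_load_pf
    return trace_inches * trace_ns_per_inch + excess_pf * derating_ns_per_pf


if __name__ == "__main__":
    # Hypothetical net: 4 inch trace, 16 loads of 7 pF, driver rated at 50 pF,
    # standard logic driver derated at 1 ns per 50 pF.
    tdr = derating_delay_ns(4, 16, 7, 50, 1 / 50)
    print(round(tdr, 2), "ns")
```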


In addition, the system designer must consider the variations in time 
between the R3051 output clock high time and output clock low time. These 
variations in the clock, tvr, are expressed in the R3051 data sheet by the t32 
and the t33 parameters and are equal to: 

tvr = ±2ns at 25MHz and less. 
tvr = ±1ns at more than 25MHz. 

Obviously, this effect only needs to be considered for events which occur at 
half-clock cycle intervals; the R3051 guarantees that the period of SysClk will 
be regular rising edge to rising edge or falling edge to falling edge. 

The analysis to set the mode register should then be as follows: 

• DRAM Page Size field (DZ1:0): 

The system designer should set this field depending on the page size of 
the DRAMs used in the external system (from 256K x 1 to 4M x 4). In the 
case of this example, the DRAM size used is 1M x 4 and the DZ1:0 bits are 
set to “10”. 

• External memory configuration (Inlvd): 

The system designer has the choice between interleaved and 
non-interleaved configurations and the types of data buffers used for the 
non-interleaved configurations. For this example, an interleaved memory 
system is used and two IDT73720 Bus Exchangers must be used in the data path. 
The configuration bit Inlvd is set to “1”. 

• RAS to CAS delay (RCD): 

The RAS to CAS delay is the delay in clock cycles from the assertion of 
a RAS signal to the assertion of the corresponding CAS signal(s). This 
parameter is derived from the “trcd” parameter found in the DRAM data 
sheets. As stated in the DRAM data sheets, “trcd” is important during read 
accesses. If the actual RAS to CAS delay is less than the max “trcd” 
specified, then the access is controlled by the RAS strobe. On the other 
hand, if the actual RAS to CAS delay is greater than the max “trcd” specified, 
then the access is controlled by the CAS strobe. Accordingly, there are two 
criteria to consider in deciding on the settings of this bit. 

- The row address hold time “trah” specified in the DRAM data 
sheet determines how long the row address must be held 
constant after the assertion of the RAS signal. This parameter is usually 
10 to 15 ns. If the RCD bit is set to “0”, DAddr will switch from the row 
address to the column address half a cycle after the assertion of the RAS 
signal. At 25 MHz, this is equivalent to 20 ns. If the RAS signal is heavily 
loaded, violation of this parameter could occur. In that case setting 
RCD to “1” would be a more prudent choice. 

- During single read accesses or for the initial latency of quad word read 
accesses, if the actual RAS to CAS delay is less than the max “trcd” 
specified, then the first word access is controlled by the RAS strobe. The 
system designer must make sure, in that case, that the data will be valid 
when the R3051 samples it. During read accesses, the R3051 samples 
the data on the same edge on which the CAS signals are negated. The system 
designer should proceed with the following analysis for RCD set to “0” 
as shown in Figure 10.3: 


tx1  RAS to CAS delay                              = 1 clock cycle minimum + 
tx2  CAS pulse width                               = 1.5 clock cycles minimum 
tx3  total available                               = 2.5 clock cycles 

tx4  minimum time available from assertion of RAS 
     = [tx3 * (1/frequency of operation)] - tvr 
     = [2.5 clock cycles * (1/frequency of operation)] - tvr 

tx5  access time from RAS (“trac” max, DRAM d/s)   = ns + 
tx6  delay through IDT73720 (max, d/s)             = ns + 
tx7  data setup time for R3051 (t2 max, R3051)     = ns + 
tx8  max capacitive derating effect (tdr max)      = ns 
tx9  maximum time to obtain data                   = ns 

For a valid system, tx9 should be less than tx4. 

In this example, RCD has been set to “1”. This corresponds to two clock 
cycles from RAS to CAS. In this case, the data in read accesses is controlled by 
the CAS strobes. 
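
The row address hold criterion described above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the function name, the `ras_delay_ns` loading term, and the sample values are assumptions made for this example and are not taken from the R3721 or DRAM data sheets. It models only the RCD = “0” case, where DAddr switches half a clock after RAS is asserted.

```python
def row_hold_margin_ns(freq_mhz, trah_ns, ras_delay_ns=0.0, switch_cycles=0.5):
    """Margin (ns) left after the DRAM row address hold time is satisfied.

    switch_cycles: clocks from RAS assertion to the DAddr row/column switch
                   (0.5 for RCD = "0", as stated in the text).
    ras_delay_ns:  hypothetical extra delay of a heavily loaded RAS line;
                   it shortens the hold time actually seen by the DRAM.
    """
    period_ns = 1000.0 / freq_mhz
    hold_seen_by_dram = switch_cycles * period_ns - ras_delay_ns
    return hold_seen_by_dram - trah_ns


# 25 MHz with trah = 15 ns: a lightly loaded RAS line leaves 5 ns of margin,
# while an assumed 8 ns of extra RAS delay violates the hold time.
print(row_hold_margin_ns(25, 15))        # 5.0
print(row_hold_margin_ns(25, 15, 8.0))   # -3.0
```

A negative result suggests that RCD = “1” is the safer setting for that board.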





[Timing diagram: SysClk, CAS(3:0), DAddr (column address), RdCEn, DRAM data bus, A/D(31:0), and Rd, with the R3051 data sample point marked] 

Figure 10.3 Analysis to Set the RCD Bit in the Mode Register 


• RAS Timing (R2:0) 

The RAS timing field encodes the RAS pulse width as well as the RAS 
pre-charge time. The system designer must set these three bits such that 
the specified RAS pulse width “tras” in the DRAM data sheets and the 
specified RAS pre-charge time “trp” are not violated. In this example, the RAS 
pulse width is set to 3 clock cycles, which is 120 ns and is longer than the 
required 80 ns. The RAS pre-charge time is set to 2 clock cycles, which is 
80 ns and longer than the required 60 to 70 ns. R2:0 are then set to “0 0 


• CAS pulse width (CO) 

The R3721 is designed in such a way that during read accesses, the 
CAS signals are negated at the same edge at which the R3051 samples the 
data. For timing analysis, during read accesses (single read or quad word 
reads) the data path is assumed to be set to the right settings (outputs of 
data buffers enabled, data buffers in the receive mode, and the latches are 
transparent). This means that from the CAS strobe (or the RAS strobe) the 
data coming out of the DRAMs passes through the data buffers directly 
to the R3051. Except for interleaved quad word read accesses, no latching 
of the data takes place. Under these circumstances, the system designer 
must ensure that the CAS pulse width is long enough for the data to come 
out of the DRAMs, through the data buffers and meet the data setup time 
of the R3051. 

The system designer should proceed with the following analysis illustrated 
in Figure 10.4: 

[Timing diagram: SysClk, CAS(3:0), DAddr (column address), and A/D(31:0), with the R3051 data sample point marked] 

Figure 10.4 CAS Pulse Width Timing Analysis 


ty1  = CAS pulse width = 1.5 (2.5) clock cycles 
ty1' = [CAS pulse width * (1/frequency of operation)] - tvr 
     = the time by which the data must be present at the input of the 
       R3051. 

ty2  SysClk to CAS low (t1 max, R3721)               = ns + 
ty3  access time from CAS (“tcac” max, DRAM d/s)     = ns + 
ty4  delay through the data buffer (max, ’245 d/s)   = ns + 
ty5  R3051 data input setup time (t1a max, R3051)    = ns + 
ty6  max capacitive derating effect (tdr max)        = ns 
ty7  max time for data to be ready                   = ns 

For proper operation, ty7 must be less than ty1'. 


For the example in this chapter, the CAS pulse width is set to 1.5 clock 
cycles: 

ty1  = 1.5 clock cycles 
ty1' = [1.5 * 40ns] - 2ns = 58ns 
ty2  = 7ns 
ty3  = 20ns 
ty4  = 6.5ns 
ty5  = 5ns 
ty6  = 5ns (estimate) 
ty7  = 43.5ns 

ty7 is less than ty1' and the system should run properly. 
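
The CAS pulse width budget above is a sum of delays compared against the available window, so it is easy to re-run for other parts or frequencies. The following Python sketch reproduces the worked numbers from this chapter; the function name and the dictionary labels are hypothetical and chosen only for this illustration.

```python
def cas_read_budget(cas_cycles, freq_mhz, tvr_ns, delays_ns):
    """Return (available_ns, required_ns, margin_ns) for a CAS-controlled read.

    available_ns corresponds to ty1' and required_ns to ty7 in the text.
    """
    period_ns = 1000.0 / freq_mhz
    available = cas_cycles * period_ns - tvr_ns
    required = sum(delays_ns.values())
    return available, required, available - required


# Values from the example in this chapter (25 MHz, tvr = 2 ns).
example_delays = {
    "ty2 SysClk to CAS low": 7.0,
    "ty3 access time from CAS": 20.0,
    "ty4 data buffer delay": 6.5,
    "ty5 R3051 data setup": 5.0,
    "ty6 capacitive derating (estimate)": 5.0,
}

available, required, margin = cas_read_budget(1.5, 25, 2.0, example_delays)
print(available, required, margin)  # 58.0 43.5 14.5
```

A positive margin means the 1.5 clock CAS pulse width is sufficient; a negative margin would point toward the 2.5 clock setting.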


For any configuration, a CAS pulse width of 2.5 clock cycles must also use 
the delayed CS. This is required to avoid spurious writes. 

• CAS pre-charge time (CP): 

Most DRAMs require a CAS pre-charge time of about 10 ns, which is 
roughly equivalent to half a clock cycle at 25 MHz. This setting is 
appropriate for most medium speed applications. However, the CAS 
pre-charge time is important during the page mode operation of the 
DRAMs. There are two criteria to consider in setting this bit: 

— During page read operations (page read accesses following page write 
accesses or quad word read accesses), the CAS is pre-charged 
and then re-asserted to enable the next word from the DRAMs, as 
illustrated in Figure 10.5 (a). In such situations, the next word to be 
read from the DRAMs will be available after a delay corresponding to: 

- access time from address “taa” or 
- access time from CAS “tcac” or 
- access time from CAS pre-charge “tacp” 

whichever is longer (as per DRAM data sheet). The system designer must 
then take into consideration the access from the CAS pre-charge time. The 
analysis for the access from the assertion of the CAS is the same as for the 
CAS pulse width analysis in Figure 10.4. The analysis for the CAS pre-charge 
time is as follows: 

tz1 = time from CAS negated to when the next data word must be 
available. This time equals the sum of the CAS pulse width and the CAS 
pre-charge times, with a minimum of 2 clock cycles and a maximum of 
4 clock cycles. 

A tz1 of 3 or 4 clock cycles is irrelevant since the access will then 
be completely determined by the CAS pulse width. The analysis will 
concentrate on the 2 clock cycle tz1, where the CAS pre-charge time is 
0.5 clock cycles and the CAS pulse width is 1.5 clock cycles. 


[Timing diagram: SysClk, CAS(3:0), DAddr (column address), RdCEn, and A/D(31:0), with the R3051 data sample point marked] 

Figure 10.5 (a) CAS Pre-charge Time Analysis 

tz1' = time for next data element 
     = 2 clock cycles * (1/frequency of operation) 

tz2 = SysClk to CAS pre-charge (t1a max, R3721)               = ns + 
tz3 = access time from CAS pre-charge (“tacp” max, DRAM d/s)  = ns + 
tz4 = delay through the data buffer (max, ’245 d/s)           = ns + 
tz5 = R3051 data input setup time (t1a max, R3051)            = ns + 
tz6 = max capacitive derating effect (tdr max)                = ns 
tz7 = max time for data to be ready                           = ns 

For proper operation, tz7 must be less than tz1'. 


For this example, the CAS pulse width is set to 1.5 clock cycles and the CAS 
pre-charge time is set to 0.5 clock cycles: 

tz1  = 2.0 clock cycles 
tz1' = 2.0 * 40ns = 80ns 
tz2  = 7ns 
tz3  = 45ns 
tz4  = 6.5ns 
tz5  = 5ns 
tz6  = 5ns (estimate) 
tz7  = 68.5ns 

tz7 is still less than tz1' and the system should run properly. 


— During page write accesses where the CAS pulse width is 1.5 clock 
cycles and the CAS pre-charge time is 0.5 clock cycles, the system 
designer must ensure that the data is available at the DRAM inputs 
before asserting the CAS strobes. That is, the delay of the data from the 
R3051 through the data buffers, plus the DRAM data setup time, must 
be less than one half clock cycle, which is tw6. This timing analysis is 
illustrated in Figure 10.5 (b). The CAS pre-charge time analysis during 
writes is as follows: 

tw6 = one half clock cycle - tvr                          = ns 

tw1 = SysClk to data from the R3051 (t19 max, R3051)      = ns + 
tw2 = delay through the data buffer (max, ’245 d/s)       = ns + 
tw3 = DRAM data setup time (“tds” min, DRAM d/s)          = ns + 
tw4 = max capacitive derating effect (tdr max)            = ns 
tw5 = max time for data to be ready                       = ns 

This time (tw5) must be less than tw6 for the proper operation of the system. 

[Timing diagram: SysClk, RASn, CAS(3:0), DAddr (column address), DRAM data bus, ALE, and Wr, with the point where the DRAM strobes data in marked] 

Figure 10.5 (b) CAS Pre-charge Time Analysis During Writes 


For this example, the CAS pulse width is set to 1.5 clock cycles and the CAS 
pre-charge time is set to 0.5 clock cycles: 

tw1 = 9ns 
tw2 = 6.5ns 
tw3 = 0ns 
tw4 = 5ns (estimate) 
tw5 = 20.5ns 


tw5 is greater than tw6 which is 18 ns (20-2 = 18 ns) and the CAS pre-charge 
time should be set to 1.5 clock cycles. However, the CAS pre-charge time will 
be set to 0.5 clock cycles in order not to slow down the quad word accesses. To 
ensure proper operation during write accesses, the WrNr bit in the mode 
register will be set to 1. In this case, page write accesses will take a minimum 
of 3 clock cycles and no violation of the DRAM specification will occur. 
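
The page write check differs from the read checks in that the window is only half a clock, so small changes in buffer delay decide the outcome. The Python sketch below computes the shortfall for the numbers used in this chapter. The function and parameter names are hypothetical, and the second call uses an assumed faster buffer purely to show how the result changes.

```python
def page_write_shortfall_ns(freq_mhz, tvr_ns, t_cpu_out, t_buffer, t_ds, t_derate):
    """Return how many ns the write data arrives too late (0.0 if it fits).

    window corresponds to tw6 (half a clock minus tvr) and needed to tw5.
    """
    window = 0.5 * (1000.0 / freq_mhz) - tvr_ns
    needed = t_cpu_out + t_buffer + t_ds + t_derate
    return max(0.0, needed - window)


# Chapter example: tw5 = 20.5 ns against tw6 = 18 ns.
print(page_write_shortfall_ns(25, 2.0, 9.0, 6.5, 0.0, 5.0))  # 2.5

# Hypothetical 4 ns buffer: the data fits within the half clock window.
print(page_write_shortfall_ns(25, 2.0, 9.0, 4.0, 0.0, 5.0))  # 0.0
```

A non-zero shortfall is the condition under which this chapter slows page writes with the WrNr bit rather than lengthening the CAS pre-charge time.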

A final consideration for setting the CAS pre-charge time stems from the 
particular nature of interleaved systems. Specifically, if the CAS pulse width 
selected is 2.5 clock cycles, then the system must also use the slow CS mode. 
This is required to avoid bus contention when switching between reads and 
writes. 

• Refresh Period (RF2:0): 

The refresh period must be set according to the frequency of operation. 
In this example, the RF2:0 bits are set to “1 0 1” for 25 MHz operation. 

• Delayed Chip Select (DCS): 

The delayed chip select must be set if the external address decoder is not 
fast enough to meet the fast chip select requirements, that is, if the 
external decoder cannot provide chip select within the first clock cycle of 
the access. In this example, where the system is running at 25 MHz, for 
fast chip select operation the DRAM_CS line must be ready within 40 ns. 
If the DCS bit is set to “1”, DRAM_CS needs to be valid within 60 ns, 
which is easily achievable. 

The delayed chip select can also be set to slow down the page write accesses. 
Slowing down the page write accesses is appropriate when the delay through 
the data buffer is such that the data is not available to the DRAM within half 
a clock cycle. In this case, setting the DCS bit will slow the page write operation 
as demonstrated in Figure 9.12 (c). It will also have the effect of adding an extra 
clock cycle for the initial latency during page read accesses (quad word reads 
included) while keeping the repetition rate (remaining words in a page) at its 
peak. 

In the example used in this chapter, the DCS bit is set to “0” since there is 
no need to slow down the page write accesses. Figure 10.6 illustrates the 
settings of the mode register used for this system. 
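
The decoder speed criterion for the DCS bit can be expressed as a small decision function. The Python sketch below is a hedged illustration: the function name is hypothetical, and the 1.5 clock limit for the delayed mode is inferred from the 60 ns figure quoted at 25 MHz rather than stated as a general rule in the text. It considers decoder speed only, not the other reasons for setting DCS described above.

```python
def choose_dcs(freq_mhz, decoder_delay_ns):
    """Return the DCS bit from decoder speed alone.

    0 = fast chip select (DRAM_CS valid within the first clock cycle)
    1 = delayed chip select (assumed limit of 1.5 clocks, i.e. 60 ns at 25 MHz)
    """
    period_ns = 1000.0 / freq_mhz
    if decoder_delay_ns <= period_ns:
        return 0
    if decoder_delay_ns <= 1.5 * period_ns:
        return 1
    raise ValueError("decoder too slow for either chip select mode")


print(choose_dcs(25, 35))  # 0
print(choose_dcs(25, 50))  # 1
```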

• Ignore WrNear: 

The WrNr bit in the mode register can be used to force the DRAM 
controller to ignore the processor WrNear output during write accesses. 
This feature is important for interleaved systems using DRAM SIMM 
modules, where the OE of the DRAMs is grounded; if OE is available from 
the SIMM, higher performance is possible by enabling the WrNear from 
the CPU. In systems with no OE control, a write to one array will cause 
a read to the other array in the bank-pair; to avoid bus contention in 
consecutive writes, the WrNr bit forces near writes to be retired in a 
minimum of three cycles (rather than two), thus allowing time to avoid bus 
contention. In this system, WrNr is set to '1' to slow writes and avoid bus 
contention. 


SYSTEM TIMING DIAGRAMS 

The following section will present different timing diagrams for bus accesses 
based on the system described earlier in this chapter. These timing diagrams 
illustrate the various CPU accesses possible, and are provided to illustrate the 
complete functionality of the R3721 DRAM Controller. 


[Figure: mode register bit fields D15 through D0 with the settings chosen for this example] 

Figure 10.6 Mode Register Settings for a Two Bank Non-interleaved System 

Figure 10.7 illustrates the timing diagrams involved in a single read access 
to the even memory array of bank-pair 0 starting from an idle state where no 
RAS signal was asserted. 

[Timing diagram: ALE, A/D(31:0), DAddr(10:0), RdCEn, WBank(3:0), and DRAM data bus, with the R3051 data sample point marked; phases: idle state, single read access to the even half-bank-pair 0, idle with RAS asserted] 

Figure 10.7 Single Read Access to the Even Half-bank-pair 0 

Figure 10.8 illustrates the timing diagrams involved in a single read access 
to the odd memory array of bank-pair 1 starting from an idle, RAS asserted state. 
Pre-charging RAS has to occur because the previous access occurred to the 
other bank-pair. 

[Timing diagram: SysClk, ALE, RAS3, CAS(3:0), DAddr(10:0), ACK, RdCEn, T/R, Path, YZLEn, WBank(3:0), DByteEn(3:0), OE, and DRAM data bus; phases: idle with RAS asserted, single read access to the odd half-bank-pair 1, idle with RAS asserted] 

Figure 10.8 Single Read Access to the Odd Half-bank-pair 1 

Figure 10.9 illustrates the timing diagrams involved in a single write access 
to the odd memory array of bank-pair 0, starting from an idle state where no 
RAS signal was asserted. 

[Timing diagram: SysClk, ALE, Wr, WrNear, A/D(31:0), CS, RAS0, RAS1, RAS2, RAS3, CAS(3:0), DAddr(10:0), ACK, RdCEn, T/R, Path, YZLEn, WBank(3,1), DByteEn(3:0), OE, and DRAM data bus; phases: idle state, single write access to the odd half-bank-pair 0, idle with RAS asserted] 

Figure 10.9 Single Write Access to the Odd Half-bank-pair 0 

Figure 10.10 illustrates the timing diagrams involved in a single write access 
to the even memory array of bank-pair 1, starting from an idle, RAS asserted 
state. 

[Timing diagram: SysClk, ALE, Wr, WrNear, A/D(31:0), CS, RAS0, RAS1, RAS2, RAS3, CAS(3:0), DAddr(10:0), ACK, RdCEn, T/R, Path, YZLEn, WBank(2,0), and DByteEn(3:0); phases: idle with RAS asserted, single write access to the even half-bank-pair 1, idle with RAS asserted] 

Figure 10.10 Single Write Access to the Even Half-bank-pair 1 







APPLICATION EXAMPLE FOR AN INTERLEAVED TWO BANK-PAIR 
CHAPTER 10 MEMORY SYSTEM USING THE R3721 DRAM CONTROLLER. 







RESET INITIALIZATION, 
REFRESH AND INPUT 

INTRODUCTION 

This chapter discusses the system housekeeping for the R3721: 

• The reset initialization sequence performed by the R3721 DRAM controller 
to initialize the DRAM memory banks under its control. 
• The CAS-before-RAS refresh sequence used by the R3721. 
• The input clocking requirements of the R3721. 


POWER-UP AND RESET 

The R3721 uses the same Reset pulse as the R3051 in order to synchronize 
its operation to the R3051. The R3721 has the same requirements as the R3051 
in terms of the power on reset pulse width, and the warm reset pulse width. 
Figure 11.1 illustrates the power on requirements of the R3721 DRAM 
Controller. 

[Timing diagram: Vcc, SysClk, and Reset, showing the power on reset pulse width t23] 

Figure 11.1 Cold Start 


Figure 11.2 illustrates the warm reset requirements of the R3721 DRAM 
Controller. 


[Timing diagram: Reset, showing the warm reset pulse width t24] 


Figure 11.2 Warm Reset 


DRAM INITIALIZATION 

Reset causes the internal mode register of the R3721 to be loaded with the 
default values (illustrated in Chapter 4), and all the output control signals are 
negated. All the internal counters are cleared. The internal refresh timer is 
loaded with the refresh interval count that corresponds to the default settings 
of the mode register. 

The R3721 DRAM Controller proceeds to initialize the complete memory 
system by issuing 15 consecutive CAS-before-RAS refresh cycles. These 15 
refresh cycles reset the internal row counter of the DRAMs. Figure 11.3 
illustrates the reset initialization sequence of the R3721. 

[Timing diagram: SysClk, Reset, RASn, CAS(3:0), DAddr(10:0), and the remaining output control signals; the CAS-before-RAS refresh cycle is repeated 15 times] 

Figure 11.3 Reset Initialization Sequence 


During the reset initialization sequence, the R3721 uses the default settings 
of the mode register to control the widths of both the RAS and the CAS strobes. 
The default values of the mode register call for the largest RAS pulse width and 
pre-charge time. This ensures that during reset initialization, the specified 
DRAM parameters (RAS pulse width, CAS pulse width, ...) are not violated. 


CAS BEFORE RAS REFRESH TIMINGS 

The R3721 has a built-in refresh timer that issues a refresh request at a 
maximum interval time of 9.6 µsec. The refresh timer gets loaded with the 
appropriate number of clock counts that are encoded in the refresh field of the 
mode register. 

The refresh interval of 9.6 µsec maximum ensures that the maximum 
specified RAS pulse width of 10 µsec (as per DRAM data sheets) is never 
violated. This feature is very important since in page mode the RAS signal can 
be kept asserted for long periods of time. 
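
The relationship between the refresh interval and the clock frequency can be illustrated with integer arithmetic. The Python sketch below computes the largest number of whole SysClk periods that fits in 9.6 µsec. It is an upper bound for illustration only: the function name is hypothetical, and the actual counts encoded by the RF2:0 field are defined by the R3721 itself and are not reproduced here.

```python
def max_refresh_clocks(freq_mhz, interval_ns=9600):
    """Largest whole number of clock periods within the refresh interval.

    freq_mhz is expected to be an integer number of MHz so that the
    calculation stays exact.
    """
    return interval_ns * freq_mhz // 1000


for mhz in (20, 25, 33, 40):
    clocks = max_refresh_clocks(mhz)
    elapsed_ns = clocks * 1000 / mhz
    # The elapsed time never exceeds 9.6 usec, which stays under the
    # 10 usec maximum RAS pulse width quoted in the text.
    print(mhz, clocks, round(elapsed_ns, 1))
```

At 25 MHz this gives 240 clocks, which is exactly 9.6 µsec with a 40 ns period.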

Figure 11.4 illustrates the timing for a CAS-before-RAS refresh sequence. 
During the refresh sequence all the DRAM control signals are negated with the 
exception of the RAS and the CAS signals. This ensures that, for 4 Mbit DRAMs, 
the WBank(3:0) signals are not asserted during the refresh, and thus the test 
mode of 4 Mbit DRAMs is not enabled. At the end of a refresh sequence or 
initialization sequence, all the control signals are negated. During a refresh 
sequence, the R3721 asserts all the RAS and all the CAS signals as a single set. 

Figure 11.4 CAS-before-RAS Refresh Sequence 

• Priority Scheme 

To resolve conflicts between an internal refresh request and external 
bus access requests, the R3721 has the following built-in refresh 
priority scheme: 

— If ALE and CS are detected at the same time as an internal refresh 
request, the R3721 gives the priority to the refresh request and services 
it. At the same time, the R3721 registers the fact that a transfer is 
pending and will service it at the end of the refresh sequence. 

— If a refresh request occurs during the time the R3721 is servicing a bus 
access, the refresh sequence will be delayed until the end of the bus 
access. 

— If a transfer is detected (ALE and CS asserted) during the time the 
R3721 is servicing a refresh request, the bus access is delayed until the 
end of the refresh sequence. Additionally, the R3721 will enable its 
RdCEn and ACK output drivers. 
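
The three priority rules above can be summarized as a small decision function. The Python sketch below is a behavioral illustration only, not a description of the R3721 internal state machine; the function name, the `busy_with` argument, and the string labels are all assumptions made for this example.

```python
def arbitrate(busy_with, refresh_request, bus_request):
    """Return (service_now, pending) according to the refresh priority rules.

    busy_with: None when idle, "refresh" or "bus" when a cycle is in progress.
    """
    if busy_with == "bus":
        # A refresh arriving during a bus access waits for the access to end.
        return "bus", ("refresh" if refresh_request else None)
    if busy_with == "refresh":
        # A transfer arriving during a refresh waits for the refresh to end.
        return "refresh", ("bus" if bus_request else None)
    if refresh_request:
        # Simultaneous requests: refresh wins and the transfer is registered.
        return "refresh", ("bus" if bus_request else None)
    if bus_request:
        return "bus", None
    return None, None


print(arbitrate(None, True, True))       # ('refresh', 'bus')
print(arbitrate("bus", True, False))     # ('bus', 'refresh')
print(arbitrate("refresh", False, True)) # ('refresh', 'bus')
```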


INPUT CLOCK REQUIREMENTS 
The R3721 uses the SysClk output directly from the R3051 to synchronize 
its operation to the R3051 timing requirements. The R3721 uses both edges of 
the SysClk to control its internal state machine. The requirements for the input 
clock to the R3721 are illustrated in Figure 11.5. 

[Timing diagram: SysClk input timing parameters] 

Figure 11.5 R3721 Input Clock Requirements 

INTRODUCTION: 

The IDT73720 Bus Exchanger is designed to interface multiple memory 
busses to a single CPU data bus. It is used in systems which implement 
multiple banks within a memory subsystem—either interleaved, or banked for 
deeper memories. 

This appendix provides an overview of the IDT73720 16-bit Bus Exchanger, 
shown in Figure A.1. Detailed information on the pin-out, packaging, and 
electrical specifications can be found in the data sheet for this device, also 
available from IDT. 


MAJOR FEATURES: 
• High speed 16-bit bus exchange for interbus communication in the 
following environments: 
— Multi-way interleaving memory 
— Multiplexed address and data busses 
• Direct Interface to R3051 Family RISChipSet™ 
— R3051™ Family of Integrated RISController™ CPUs 
— R3721 DRAM Controller 
• Supports R3051 family systems from 20 to 40MHz 
• Interfaces a single CPU bus to interleaved memory systems 
• Data path for read and write operations 
• Low noise 12mA TTL level outputs 
• Simplifies data path design in high-performance memory systems 
• Bidirectional 3-Bus Architecture: X, Y, Z 
— One CPU Bus: X 
— Two (interleaved or banked) memory busses: Y & Z 
— Each bus can be independently latched 
• Byte control on all three busses 
• Source terminated outputs for low noise and undershoot control 
• 68-pin PLCC package 
• High-performance CMOS technology 


[Block diagram: X, Y, and Z ports with the read and write latches and bus control] 

Figure A.1 Block Diagram of IDT73720 Bus Exchanger 

DESCRIPTION: 

The IDT73720 Bus Exchanger is a high speed 16-bit bus exchange device 
intended for inter-bus communication in multi-way interleaving or multi- 
banked memory systems. 

The 73720 Bus Exchanger provides data path support in an R3051 family 
system utilizing interleaved or banked memory techniques. The Bus Exchanger 
is responsible for interfacing between the CPU A/D bus (CPU address/data 
bus) and multiple memory data busses. The R3721 DRAM Controller has been 
designed to directly control a pair of Bus Exchangers as the data path between 
the CPU bus and DRAM memory busses. 

The 73720 uses a three bus architecture (X, Y, Z), with control signals 
suitable for simple transfer between the CPU bus (X) and either memory bus 
(Y or Z). The Bus Exchanger features independent read and write latches for 
each memory bus, thus supporting a variety of memory strategies. Y and Z 
ports support individual byte output enables to independently enable upper 
and lower bytes. 


ARCHITECTURE OVERVIEW: 

The Bus Exchanger is used to service both read and write operations 
between the CPU and the dual memory busses. It includes independent data 
path elements for reads from and writes to each of the memory banks (Y and 
Z). Data flow control is managed by a simple set of control signals, analogous 
to a simple transceiver. In short, the Bus Exchanger allows bidirectional 
communication between ports X and Y and ports X and Z. 

The data path elements for each memory port include: 

Read Latch: Each of the memory ports Y and Z contains a transparent latch 
to capture the contents of the memory bus. Each latch features an independent 
latch enable. During reads, the R3721 will assert the YZLEn output (tied to the 
LEYX and LEZX inputs of the 73720) to cause the bus exchanger to capture the 
data from the interleaved memory array. 

Write Latch: Each memory port Y and Z contains an independent latch to 
capture data from the CPU bus during writes. Each memory port write latch 
features an independent latch enable, allowing write data to be directed to a 
specific memory port without disrupting the other memory port. The R3721 
uses the write data path as a simple data transceiver, and thus does not need 
to latch data into either of the write latches. 


DATA FLOW CONTROL SIGNALS 

T/R (Transmit/Receive): This signal controls the direction of data transfer. 
A transmit is used for CPU writes, and a receive is used for read operations. The 
R3721 T/R output has been designed to drive either a pair of Bus Exchangers, 
or 4 IDT 74FCT245s. 

Path: The path control signal is used to select between the even memory 
path Y and the odd memory path Z during read or write operations. Path selects 
the memory port to be connected to the CPU bus (X-port). The R3721 uses a 
Path value of "1" for the even memory port, and "0" for the odd port. 

In an interleaved memory system, data is captured into the Bus Exchanger 
using the R3721 YZLEn; the R3721 then uses the Path signal to sequence the 
even read latch followed by the odd read latch onto the CPU port. Thus, both 
words are returned to the processor in back to back cycles. 

OEH, OEL are the output enable control signals to select upper or lower 
bytes of all three ports. These signals, in conjunction with T/R and Path, 
determine the current output ports of the Bus Exchanger. 

MEMORY READ OPERATIONS 

Memory reads can be thought of as occurring in two distinct stages. During 
the first stage, the data present at the memory port is captured by the read latch 
for that memory port. During a subsequent phase, data is brought from a 
selected read latch to the CPU A/D port X by using output enable control. 

The read operation is selected by driving T/R low. The read is managed using 
the Path input to select the memory port (Y or Z); the LEYX/LEZX enable the 
data capture into the corresponding Read Latch. 

The read latches enable the R3721 to perform high-performance bursts in 
interleaved memory systems. The R3721 reads both banks simultaneously; 
once the data is available, the R3721 closes the Bus Exchangers' read latches, 
capturing the data. The DRAM Controller then sequences the data onto the 
CPU bus, while simultaneously pre-charging CAS, and re-asserting CAS to 
obtain the second pair of words. In many systems, this strategy allows the 
DRAM controller to return all four words of a quad word read at the maximum 
data rate of the CPU. 

Note that the Bus Exchanger may be used as a data transceiver by holding 
the latches open. In this case, the two phases of the read operation are 
compressed into a single activity. This is how the R3721 uses the Bus 
Exchanger during single read operations, banked memory configurations, and 
for the first word of an even-odd pair in an interleaved memory system. 


MEMORY WRITE OPERATIONS 

The R3721 always uses the Bus Exchanger as a simple transceiver during 
write operations. Thus, CPU data is never latched into the Bus Exchanger write 
latches. 

The R3721 uses the T/R, Path, and OEU/OEL control signals to properly 
steer processor data into the DRAM arrays. The write operation is selected 
by driving T/R high. Writes are thus performed using the Path input to select 
the memory port (Y or Z). The LEXY/LEXZ are not used, and thus are tied high 
to enable CPU data to flow through the latches. 

PIN DESCRIPTION 

This section describes the signals used by the Bus Exchanger. More detail 
on these pins may be found in the IDT73720 data sheet; more detail on the 
R3721 interface can be found in other chapters of this manual. Note that 
signals indicated by an overbar are active low. 


X(0:15) I/O 
Bidirectional Data Port X. In an R3721 system, this is connected to the 
CPU’s A/D (Address/Data) bus. 


Y(0:15) I/O 
Bidirectional Data port Y. In an R3721 based system, this is connected to 
the even path or lower bank of memory. 


Z(0:15) I/O 
Bidirectional Data port Z. In an R3051 based system, this is connected to 
the odd path or upper bank of memory. 


LEXY I 
Latch Enable input for Y-Write Latch. In an R3721 system, this is tied 
high. 


LEXZ I 
Latch Enable input for Z-Write Latch. In an R3721 system, this is tied 
high. 


LEYX I 
Latch Enable input for the Y-Read Latch. The Y-Read Latch is open when 
LEYX is high. Data from the even path Y is latched on the high to low transition 
of LEYX. In an R3721 system, this is tied to the YZLEn output of the R3721. 


LEZX I 
Latch Enable input for the Z-Read Latch. The Z-Read Latch is open when 
LEZX is high. Data from the odd path Z is latched on the high to low transition 
of LEZX. In an R3721 system, this is tied to the YZLEn output of the R3721. 


PATH I 

Even/Odd Path Selection. When high, PATH enables data transfer between 
the X-Port and the Y-port (even path). When low, PATH enables data transfer 
between the X-Port and the Z-port (odd path). 


T/R I 

Transmit/Receive Data. When T/R is high, Port X is enabled to transfer 
(write) data into the memory port specified by PATH. When T/R is low, Port X 
is enabled to receive (Read) data from the memory port specified by PATH. 


OEH I 
Output Enable for Upper byte. When low, the Upper byte of data is transferred 
to the port specified by PATH in the direction specified by T/R . 


OEL I 
Output Enable for Lower byte. When low, the Lower byte of data is transferred 
to the port specified by PATH in the direction specified by T/R . 
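
The way T/R, PATH, OEH, and OEL combine can be summarized as a small decoding function. The Python sketch below follows the pin descriptions above and assumes the latches are held transparent, as the R3721 does when it uses the Bus Exchanger as a simple transceiver. The function name and the returned labels are hypothetical and serve only as an illustration.

```python
def bus_exchanger_transfer(t_r, path, oeh_n, oel_n):
    """Decode 73720 control inputs into (source, destination, byte_lanes).

    t_r:   1 = transmit (X port to memory port), 0 = receive (memory port to X)
    path:  1 = Y port (even path), 0 = Z port (odd path)
    oeh_n, oel_n: active-low output enables for the upper and lower byte
    """
    memory_port = "Y" if path else "Z"
    if t_r:
        source, destination = "X", memory_port
    else:
        source, destination = memory_port, "X"
    lanes = []
    if not oeh_n:
        lanes.append("upper")
    if not oel_n:
        lanes.append("lower")
    return source, destination, lanes


# CPU write of a full 16-bit value to the even path.
print(bus_exchanger_transfer(1, 1, 0, 0))  # ('X', 'Y', ['upper', 'lower'])

# CPU read of the upper byte only from the odd path.
print(bus_exchanger_transfer(0, 0, 0, 1))  # ('Z', 'X', ['upper'])
```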
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